WO2023053480A1 - Information processing device, information processing method, and program - Google Patents
Information processing device, information processing method, and program
- Publication number
- WO2023053480A1 (PCT/JP2022/006048)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- sound source
- signal
- additional information
- sound
- mixed
- Prior art date
Classifications
- G06N3/02—Neural networks
- G06N20/00—Machine learning
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
- G10L21/0272—Voice signal separating
- G10L25/30—Speech or voice analysis techniques characterised by the analysis technique using neural networks
Definitions
- the present disclosure relates to an information processing device, an information processing method, and a program.
- One object of the present disclosure is to provide an information processing device, an information processing method, and a program capable of appropriately detecting information embedded in a signal, such as digital watermark information.
- the present disclosure is, for example, an information processing device including a decoder that extracts additional information contained in a mixed sound signal obtained by mixing a plurality of sound source signals by applying a predetermined learning model, in which changes in the sound source signal and the mixed sound signal due to the addition of the additional information are rendered imperceptible.
- the present disclosure is also, for example, an information processing method in which a decoder extracts additional information contained in a mixed sound signal obtained by mixing a plurality of sound source signals by applying a predetermined learning model, and changes in the sound source signal and the mixed sound signal due to the addition of the additional information are rendered imperceptible.
- the present disclosure is also, for example, a program that causes a computer to execute an information processing method in which a decoder extracts additional information contained in a mixed sound signal obtained by mixing a plurality of sound source signals by applying a predetermined learning model, and changes in the sound source signal and the mixed sound signal due to the addition of the additional information are rendered imperceptible.
- the present disclosure is also, for example, an information processing device including a concealer that embeds additional information by applying a predetermined learning model to at least one of a plurality of sound source signals and a mixed sound signal obtained by mixing the plurality of sound source signals, in which changes in the sound source signal and the mixed sound signal due to the addition of the additional information are rendered imperceptible.
- the present disclosure is also, for example, an information processing method in which a concealer performs processing to embed additional information by applying a predetermined learning model to at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed, and changes in the sound source signal and the mixed sound signal due to the addition of the additional information are rendered imperceptible.
- the present disclosure is also, for example, a program that causes a computer to execute an information processing method in which a concealer performs processing to embed additional information by applying a predetermined learning model to at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed, and changes in the sound source signal and the mixed sound signal due to the addition of the additional information are rendered imperceptible.
- FIG. 1 is a diagram for explaining an outline of a system according to an embodiment.
- FIG. 2 is a block diagram showing a configuration example of the playback device according to the embodiment.
- FIG. 3 is a diagram for explaining a system configuration example according to the embodiment.
- FIG. 4 is a diagram for explaining another system configuration example according to the embodiment.
- FIG. 5 is a diagram for explaining another system configuration example according to the embodiment.
- FIG. 6 is a diagram referred to in describing the learning model that can be applied to the concealer and decoder according to the embodiment.
- FIG. 7 is a diagram that is referred to when describing the effects obtained by the embodiment.
- FIG. 8A is a diagram for explaining an example of a UI (User Interface) applicable to the present disclosure.
- FIG. 8B is a diagram for explaining an example of a UI applicable to the present disclosure.
- FIG. 1 shows a configuration example of a playback system (playback system 1) according to one embodiment.
- a reproduction system 1 has a distribution device 2 and a reproduction device 3 that are connected via a network NW.
- a plurality of distribution devices 2 and reproduction devices 3 may be provided.
- the distribution device 2 and the reproduction device 3 correspond to an example of an information processing device.
- a typical example of the network NW is the Internet, but any network such as a LAN (Local Area Network), Bluetooth (registered trademark), Wi-Fi (registered trademark), or the like may be used.
- the distribution device 2 and the playback device 3 may be connected by a wire.
- the playback device 3 is described as a smart phone, but the playback device 3 may be anything such as a personal computer, a hearable device such as headphones or earphones, or other wearable device.
- the distribution device 2 has a concealer that embeds additional information by applying a predetermined learning model to at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed.
- changes in the sound source signal and the mixed sound signal due to the addition of the additional information are made imperceptible.
- a mixed sound signal including additional information is transmitted to the reproducing device 3 via the network NW.
- the playback device 3 has, for example, a decoder that extracts additional information contained in a mixed sound signal obtained by mixing a plurality of sound source signals by applying a predetermined learning model. Changes in the sound source signal and the mixed sound signal due to the addition of the additional information are made imperceptible.
- here, "imperceptible" means that the change in the sound source signal and the mixed sound signal due to the addition of the additional information is at a level that the user (the listener of the sound source signal and mixed sound signal) cannot perceive.
- the concealer and the decoder are configured as neural networks. The same learning model, obtained by the learning described later, can be applied both as the learning model of the concealer and as the learning model of the decoder.
- in the following, digital watermark information is described as an example of the additional information; however, the additional information is not limited to digital watermark information.
- FIG. 2 is a block diagram showing a configuration example of the playback device 3 according to one embodiment.
- the playback device 3 includes a control unit 301; a microphone 302A and an audio signal processing unit 302B connected to the microphone 302A; a camera unit 303A and a camera signal processing unit 303B connected to the camera unit 303A; a network unit 304A and a network signal processing unit 304B connected to the network unit 304A; a speaker 305A and an audio reproduction unit 305B connected to the speaker 305A; and a display 306A and a screen display unit 306B connected to the display 306A.
- the audio signal processing unit 302B, camera signal processing unit 303B, network signal processing unit 304B, audio reproduction unit 305B, and screen display unit 306B are each connected to the control unit 301.
- the control unit 301 is composed of a CPU (Central Processing Unit) and the like.
- the control unit 301 has a ROM (Read Only Memory) in which programs are stored, a RAM (Random Access Memory) used as a work area when a program is executed, and the like (not shown).
- the control unit 301 controls the playback device 3 in an integrated manner.
- the microphone 302A picks up the user's speech and the like.
- the audio signal processing unit 302B performs known audio signal processing on audio data of sounds picked up via the microphone 302A.
- the camera unit 303A includes an optical system such as a lens, an imaging device, and the like.
- the camera signal processing unit 303B performs image signal processing, such as A/D (Analog to Digital) conversion, various correction processes, and object detection, on images (still or moving) acquired via the camera unit 303A.
- the network unit 304A includes an antenna and the like.
- the network signal processing unit 304B performs modulation/demodulation processing, error correction processing, and the like on data transmitted/received via the network unit 304A.
- the audio reproduction unit 305B performs processing for reproducing sound from the speaker 305A.
- the audio reproducing unit 305B performs, for example, amplification processing and D/A conversion processing. Also, the audio reproducing unit 305B performs processing for generating sounds to be reproduced from the speaker 305A.
- the audio reproduction unit 305B includes a sound source separation unit 305C.
- the sound source separation unit 305C performs sound source separation on a mixed sound signal in which a plurality of sound source signals are mixed and a perturbation optimized to improve the performance of sound source separation is mixed.
- the control unit 301 performs control so as not to operate the sound source separation unit 305C.
- a part of the separated signals obtained by the sound source separation may be reproduced from the speaker 305A, or each separated signal may be stored in an appropriate memory.
- the audio reproducing unit 305B includes a decoder 305D.
- the decoder 305D extracts digital watermark information from a mixed sound signal containing a plurality of sound source signals or a separated signal obtained by separating sound sources from the mixed sound signal.
- the sound source separation unit 305C and the decoder 305D do not necessarily have to be included in the audio reproduction unit 305B.
- the decoder 305D may be composed of one decoder or a plurality of decoders.
- An LCD (Liquid Crystal Display) or an organic EL (Electro Luminescence) display can be applied as the display 306A.
- the screen display unit 306B performs known processing for displaying various information on the display 306A.
- the display 306A may be configured as a touch panel. In this case, the screen display unit 306B also performs detection processing of the operation position associated with the touch operation.
- the first example is an example in which a mixed sound signal x is composed of three sound source signals (sound source signals XA, XB, and XC).
- the sound source signal XA is, for example, a drum sound source signal
- the sound source signal XB is, for example, a vocal sound source signal
- the sound source signal XC is, for example, a piano sound source signal.
- the distribution device 2 has a concealer 41 and an adder 61, for example.
- the playback device 3 has a decoder 51 corresponding to the decoder 305D described above.
- the concealer 41 applies the learning model 100 to some of the sound source signals, for example the drum sound source signal XA, to embed the digital watermark information WI.
- the adder 61 adds the drum sound source signal XA containing the digital watermark information WI, the vocal sound source signal XB not containing it, and the piano sound source signal XC not containing it, thereby generating the mixed sound signal X.
- a mixed sound signal X is transmitted to the reproduction device 3 via a communication unit (not shown). Note that the mixed sound signal X may be compressed by an appropriate method.
- the reproduction device 3 receives the mixed sound signal X via a communication unit (not shown).
- the decoder 51 applies the learning model 100 to extract the digital watermark information WI included in the mixed sound signal X.
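For illustration, the signal flow of this first example (embed in one source, mix, decode from the mixture) can be sketched as follows. The concealer and decoder below are toy stand-ins (a tiny additive perturbation and a sign detector), not the trained learning model 100 of the disclosure; the sample rate, signal contents, and 64-bit watermark size are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 16000  # one second of audio at an assumed 16 kHz

# Hypothetical sound source signals standing in for real recordings.
xa_drums = rng.standard_normal(N)   # sound source signal XA
xb_vocal = rng.standard_normal(N)   # sound source signal XB
xc_piano = rng.standard_normal(N)   # sound source signal XC

wi = rng.integers(0, 2, size=64)    # digital watermark information WI (64 bits)

def concealer(x, w, eps=1e-4):
    """Stand-in for concealer 41: embeds watermark bits as an
    imperceptibly small additive perturbation (toy scheme, not the
    learned model of the disclosure)."""
    p = np.zeros_like(x)
    p[: len(w)] = eps * (2 * w - 1)   # map bits {0,1} to {-eps,+eps}
    return x + p

# Embed WI only in the drum source, as in the first example.
xa_marked = concealer(xa_drums, wi)

# Adder 61: mix the three sources into the mixed sound signal X.
x_mixed = xa_marked + xb_vocal + xc_piano

def decoder(x, n_bits=64):
    """Stand-in for decoder 51: shows only the interface shape of a
    decoder (the real extraction is done by the trained network)."""
    return (x[:n_bits] > 0).astype(int)

wi_hat = decoder(x_mixed)
```

Note that this toy decoder does not actually recover WI from the mixture; in the disclosure, recovery is what the jointly trained concealer/decoder networks are optimized for.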
- the concealer 41 causes the mixed sound signal X to contain digital watermark information WI.
- as the sound source signals forming the mixed sound signal X, the same sound source signals (sound source signals XA to XC) as in the first example can be applied.
- the playback device 3 has decoders 51A to 51C, for example.
- the decoders 51A-51C extract the digital watermark information WI included in the corresponding sound source signal.
- the mixed sound signal X received by the playback device 3 is separated into separated signals XA' to XC' by sound source separation by the sound source separation unit 305C.
- the sound source separation unit 305C performs sound source separation by applying a DNN-based sound source separation model, for example.
- the digital watermark information WI embedded in the mixed sound signal X can be left in the sound source separation result.
- the decoder 51A applies the learning model 100 to extract the digital watermark information WI included in the separated signal XA'.
- the decoder 51B applies the learning model 100 to extract the digital watermark information WI included in the separated signal XB'.
- the decoder 51C applies the learning model 100 to extract the digital watermark information WI included in the separated signal XC'.
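What sound source separation delivers to the decoders in this second example can be illustrated with a deliberately simple separator. The fixed frequency mask below stands in for the DNN-based separation model of the sound source separation unit 305C, and the two single-frequency sources are assumptions chosen so the toy mask separates them exactly.

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs

# Two hypothetical sources occupying disjoint frequency bands.
xa = np.sin(2 * np.pi * 100 * t)    # low-frequency source (e.g. "drums")
xb = np.sin(2 * np.pi * 1000 * t)   # high-frequency source (e.g. "vocal")
x = xa + xb                         # mixed sound signal X

def separate(x, fs, cutoff=500.0):
    """Toy stand-in for sound source separation unit 305C: a fixed
    spectral mask instead of a learned DNN separation model."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1 / fs)
    low_mask = freqs < cutoff
    xa_sep = np.fft.irfft(X * low_mask, n=len(x))    # separated signal XA'
    xb_sep = np.fft.irfft(X * ~low_mask, n=len(x))   # separated signal XB'
    return xa_sep, xb_sep

xa_prime, xb_prime = separate(x, fs)

# Each separated signal approximates its source, so a per-source
# decoder (51A, 51B) can be applied to XA' and XB' individually.
err_a = np.max(np.abs(xa_prime - xa))
err_b = np.max(np.abs(xb_prime - xb))
```

Because each sine sits on an exact FFT bin here, the mask separates the mixture almost perfectly; a real DNN separator produces approximate separated signals, which is why the training described later must make the watermark survive separation.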
- a third example is an example in which the mixed sound signal X is composed of a drum sound source signal XA and a piano sound source signal XB, and digital watermark information is embedded in each sound source signal.
- the distribution device 2 also has concealers 41A and 41B corresponding to each sound source signal.
- the concealer 41A applies the learning model 100 to embed the digital watermark information WIa in the sound source signal XA.
- the concealer 41B embeds the electronic watermark information WIb in the sound source signal XB by applying the learning model 100.
- the concealer may include a plurality of concealers operating in parallel, or a single concealer may sequentially perform the processes of embedding the digital watermark information WIa and WIb.
- the mixed sound signal X in this example is generated by the adder 62 adding the two signals in which the digital watermark information is embedded.
- the mixed sound signal X is subjected to sound source separation processing by the sound source separation unit 305C, thereby generating a separated signal XA' and a separated signal XB'.
- the decoder 51A extracts the digital watermark information WIa included in the separated signal XA' by performing processing applying the learning model 100 to the separated signal XA'.
- the decoder 51B extracts the digital watermark information WIb included in the separated signal XB' by applying the learning model 100 to the separated signal XB'.
- digital watermark information may be embedded in only some of the sound source signals (for example, the drum sound source signal XA), or in some or all of the sound source signals and the mixed sound signal.
- FIG. 6 is a diagram generalizing the processing performed in this embodiment.
- the illustration includes all of the processing that can be performed in this embodiment; however, it is not necessary to perform all of the processing shown in FIG. 6.
- in FIG. 6, c denotes the concealer and d denotes the decoder. x_i denotes a sound source signal; this may be a single sound source signal or a mixed sound signal in which a plurality of sound source signals are mixed. w_i denotes digital watermark information.
- the mixed sound signal can be expressed as x = Σ_i c(x_i, w_i).
- the sound source separation result can be expressed as f_i(x), where f_i is the sound source separation model for sound source i.
- the sound source separation model f may be a weighted average of multiple sound source separation models.
- the digital watermark information obtained by decoding the sound source signal without performing sound source separation can be expressed as d(c(x_i, w_i)).
- the digital watermark information obtained by decoding a separated signal, i.e., the result of sound source separation, can be expressed as d(f_i(x)).
- the learning model 100 is obtained by learning to minimize the loss function L shown in Equation (1) below. Based on the results of such learning, the parameters of the concealer and decoder neural networks are optimized. As an optimization method, for example, stochastic gradient descent can be used.

  L = Σ_i [ E1(c(x_i, w_i), x_i) + λ E2(f_i(x), x_i) + E3(d(c(x_i, w_i)), w_i) + E4(d(f_i(x)), w_i) ]   … (1)

  where c is the concealer, d is the decoder, f_i is the sound source separation model for sound source i, x_i is a sound source signal, w_i is digital watermark information, E1 to E4 are error functions, and λ is a weighting factor.
- the learning model 100 is obtained by learning to minimize a loss function based on: an error function between the sound source signals before and after the digital watermark information is embedded; an error function between the signal in which the digital watermark information is embedded and the corresponding signal obtained by sound source separation; an error function between the digital watermark information before embedding and that decoded without sound source separation; and an error function between the digital watermark information before embedding and that decoded from the sound source separation result.
- the first term on the right side of Equation (1) is a term for making the acoustic properties (frequency characteristics, etc.) of the sound source signal before and after the digital watermark information is embedded as close as possible.
- the second term on the right side of Equation (1) is a term for making the sound source signal before sound source separation and the sound source signal after sound source separation as close as possible in terms of acoustic properties.
- the third term on the right side of Equation (1) is a term for making the digital watermark information obtained by decoding, without mixing with other sound source signals and without sound source separation, as close as possible to the original digital watermark information.
- the fourth term on the right side of equation (1) is a term for making the digital watermark information obtained as a result of decoding after sound source separation as close as possible to the original digital watermark information.
- the signal embedded with the digital watermark information can thereby be made substantially the same as the original signal; that is, the change from the original signal can be made imperceptible.
- also, the digital watermark information itself can be made imperceptible.
- the digital watermark information can also be prevented from leaking into the other sound source signals (from adversely affecting them).
- all the sound source signals after separation can be substantially the same as the sound source signals before separation (the change due to the electronic watermark information is imperceptible).
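The four loss terms described above can be sketched concretely. In the sketch below, the squared-error form of every term, the placement of the weighting factor λ, and the stub models are assumptions for illustration; the disclosure's actual error functions and network architectures are not specified here.

```python
import numpy as np

def loss_eq1(x, w, c, d, f, lam=1.0):
    """Sketch of the four-term loss of Equation (1) for one source.

    x : clean sound source signal
    w : watermark bit vector (floats in {0, 1})
    c : concealer,  c(x, w) -> watermarked signal
    d : decoder,    d(signal) -> watermark estimate
    f : separator,  f(signal) -> estimate of this source after separation
    """
    xw = c(x, w)                        # signal with watermark embedded
    xs = f(xw)                          # sound source separation result
    t1 = np.mean((xw - x) ** 2)         # term 1: embed imperceptibly
    t2 = np.mean((xs - x) ** 2)         # term 2: survive separation
    t3 = np.mean((d(xw) - w) ** 2)      # term 3: decodable without separation
    t4 = np.mean((d(xs) - w) ** 2)      # term 4: decodable after separation
    return t1 + lam * t2 + t3 + t4

# Trivial stub models just to exercise the function.
w_bits = np.array([1.0, 0.0, 1.0, 1.0])
x_sig = np.zeros(16)
c = lambda x, w: x + 1e-3 * np.pad(2 * w - 1, (0, len(x) - len(w)))
d = lambda s: (s[: len(w_bits)] > 0).astype(float)
f = lambda s: s   # perfect "separator" stub
L = loss_eq1(x_sig, w_bits, c, d, f)
```

In actual training, c and d would be neural networks and this loss would be minimized over their parameters by stochastic gradient descent, as stated above.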
- SDR: Signal to Distortion Ratio
- CER: Character Error Rate
- Pattern 1: Random
- Pattern 2: Baseline (original). A learning model obtained by training with a loss function that does not consider sound source separation processing (a loss function with only the first and third terms on the right side of Equation (1)) is applied to the concealer and decoder.
- Pattern 3: STSS (separation). A learning model obtained by training with a loss function that considers only sound source separation processing (a loss function with only the second and fourth terms on the right side of Equation (1)) is applied to the concealer and decoder.
- Pattern 4: STSS (separation + original). A learning model obtained by training with the loss function shown in Equation (1) is applied to the concealer and decoder.
- for Pattern 1, the CER results were 96.3% for both "original" and "separation", indicating poor accuracy.
- for Pattern 2, the "original" results of SDR and CER were good.
- Pattern 3 uses a learning model obtained with a loss function that considers sound source separation processing, so the "separation" results of SDR and CER are good.
- on the other hand, the "original" results of SDR and CER were 34.6% and 38.5%, respectively.
- for Pattern 4, the SDR results for "original" and "separation" were good at 35.2% and 37.9%, respectively. In general, when the SDR exceeds 30%, changes in the sound source signal and the mixed sound signal due to the embedding of digital watermark information become imperceptible.
- this experiment confirmed that processing using the learning model 100 satisfies this criterion. Furthermore, the CER results for both "original" and "separation" were good at 0.0%. From the above, it was confirmed that Pattern 4, that is, processing by the concealer and decoder using the learning model 100 according to the present embodiment, gives the highest accuracy.
- FIGS. 8A and 8B show an example of a UI (User Interface) applicable to this embodiment.
- the UI shown in FIGS. 8A and 8B is displayed, for example, on a display device.
- FIG. 8A shows a UI mainly used on the distribution side
- FIG. 8B shows UIs mainly used on the playback side.
- a UI 61 used by creator A (a content creator) on the distribution side includes, for example, a display 61A for designating the digital watermark information to be embedded, a display 61B for designating the audio file in which to embed it, a display 61C for designating the embedding strength, and a button 61D that starts the process of embedding the digital watermark information in the designated audio file.
- Display 61C includes, for example, radio buttons respectively corresponding to embedding strength "strong” and embedding strength "weak”.
- the embedding strength “strong” means embedding the digital watermark information so that the sound source signals before and after the sound source separation are as close as possible (so that the change due to the digital watermark information is as imperceptible as possible).
- the embedding strength "weak" means that the sound source signal before and after sound source separation may differ slightly (the change due to the digital watermark information may be slightly perceptible).
- learning models corresponding to embedding strengths "strong” and “weak” can be obtained by changing the value of ⁇ in equation (1).
- the learning model used by the concealer is switched according to the user's choice of embedding strength. That is, by allowing the user to select the embedding strength, it is possible to select the learning model to be applied when including the digital watermark information.
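The model switching described here amounts to a lookup keyed by the selected strength. In the sketch below, the λ values and model file names are hypothetical; the disclosure only states that the two models are obtained by changing the value of λ in Equation (1).

```python
# Hypothetical mapping from the embedding-strength radio buttons on
# display 61C to learning models trained with different lambda values.
STRENGTH_TO_MODEL = {
    "strong": {"lambda": 10.0, "model_path": "model_lambda_10.bin"},
    "weak":   {"lambda": 0.1,  "model_path": "model_lambda_0.1.bin"},
}

def select_concealer_model(strength: str) -> dict:
    """Return the (hypothetical) model configuration for the chosen
    embedding strength."""
    if strength not in STRENGTH_TO_MODEL:
        raise ValueError(f"unknown embedding strength: {strength!r}")
    return STRENGTH_TO_MODEL[strength]

cfg = select_concealer_model("strong")
```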
- the UI used by creator B also includes similar display elements.
- a mixed sound signal is generated by using a mixing tool to mix the sound source signals in which creators A and B have embedded digital watermark information.
- the UI 62 related to the mixing tool includes, for example, waveforms of each sound source signal.
- a known UI can be applied as the UI of the mixing tool. Effects and the like are applied by the mixing tool.
- the distributed mixed sound signal is subjected to sound source separation by sound source separation software.
- a display 71 containing the waveform of each separated signal is displayed on a suitable display.
- the UI 72 is displayed on the display section by executing the digital watermark information extraction software.
- the UI 72 includes, for example, a waveform display 72A of a separated sound source and a display 72B of extracted digital watermark information (character strings in this embodiment).
- if the learning model 100 is not applied, the digital watermark information is not extracted, and the display 72B is blank or displays an error.
- the extraction of the character string corresponding to the digital watermark information can, for example, be used to deter illegal distribution of separated signals, although usage is not limited to this example. That is, if digital watermark information is extracted by applying the extraction software to a sound source signal downloaded from a site that illegally distributes sound source signals, it serves as evidence that the sound source signal is not distributed under license. Various other applications using digital watermark information are also possible.
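The license-evidence use case can be sketched as a check of the extracted character string against a registry of licensed releases. The function name, registry contents, and verdict strings below are all hypothetical illustrations, not part of the disclosure.

```python
from typing import Optional

# Hypothetical registry of watermark strings embedded in licensed releases.
LICENSED_WATERMARKS = {"SONG-0001-LICENSED", "SONG-0002-LICENSED"}

def check_license(extracted: Optional[str]) -> str:
    """Interpret the character string produced by the (hypothetical)
    watermark extraction software on display 72B."""
    if extracted is None:
        return "no watermark extracted (wrong model or unmarked signal)"
    if extracted in LICENSED_WATERMARKS:
        return "licensed release"
    return "watermark present but not licensed: possible illegal copy"

verdict = check_license("SONG-0001-LICENSED")
```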
- Equation (2) differs from Equation (1) in that a fifth term is added to the right side. Here, d is the decoder, f_i is the sound source separation model for sound source i, and x_j (j ≠ i) denotes the other sound source signals with respect to the sound source signal obtained as the sound source separation result. The weighting factor of the fifth term is an experimentally determined value.
- Some of the processing described in one embodiment may be performed by a device other than the distribution device and the playback device, such as a server. Any number or type of sound source signals can be applied in the mixed sound signal.
- the present disclosure can also be implemented in any form such as a device, method, program, system, and the like.
- a program that implements the functions described in the above embodiment can be created, and by downloading and installing the program in a device that does not have those functions, the device can perform the control described in the embodiment.
- the present disclosure can also be implemented by a server that distributes such programs.
- the items described in each embodiment and modifications can be combined as appropriate.
- the contents of the present disclosure should not be construed as being limited by the effects exemplified herein.
- the present disclosure can also adopt the following configurations.
- the additional information is information added to a mixed sound signal obtained by mixing the plurality of sound source signals.
- the decoder extracts the additional information from a separated signal obtained by subjecting the mixed sound signal to sound source separation processing.
- the additional information is included in at least one sound source signal forming the mixed sound signal.
- the information processing device wherein the additional information is included in each sound source signal that constitutes the mixed sound signal.
- the information processing apparatus further comprising a sound source separation unit that performs the sound source separation process.
- the learning model is obtained by learning to minimize a loss function based on: an error function between the sound source signals before and after the additional information is included; an error function between the signal in which the additional information is included and the corresponding signal obtained by sound source separation; an error function between the additional information before and after being included in the sound source signal; and an error function between the additional information before being included in the sound source signal and the additional information extracted from the sound source signal obtained by sound source separation.
- an information processing method in which a decoder extracts additional information contained in a mixed sound signal obtained by mixing a plurality of sound source signals by applying a predetermined learning model, and a change in the sound source signal and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
- a program for causing a computer to execute an information processing method in which a decoder extracts additional information contained in a mixed sound signal obtained by mixing a plurality of sound source signals by applying a predetermined learning model, and changes in the sound source signal and the mixed sound signal due to the addition of the additional information are rendered imperceptible.
- an information processing device including a concealer that embeds additional information by applying a predetermined learning model to at least one of a plurality of sound source signals and a mixed sound signal obtained by mixing the plurality of sound source signals, in which a change in the sound source signal and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
- the information processing device according to any one of (11) to (14), wherein the learning model is obtained by learning to minimize a loss function based on: an error function between the sound source signals before and after the additional information is included; an error function between the signal in which the additional information is included and the corresponding signal obtained by sound source separation; an error function between the additional information before and after being included in the sound source signal; and an error function between the additional information before being included in the sound source signal and the additional information extracted from the sound source signal obtained by sound source separation.
- the information processing apparatus according to any one of (11) to (16), wherein the additional information is digital watermark information.
- a concealer applies a predetermined learning model to at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed to perform processing to include additional information, An information processing method, wherein a change in the sound source signal and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
- a concealer applies a predetermined learning model to at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed to perform processing to include additional information, A program for causing a computer to execute an information processing method in which changes in the sound source signal and the mixed sound signal due to the addition of the additional information are rendered imperceptible.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Mathematical Physics (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Molecular Biology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Quality & Reliability (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
Description
An information processing apparatus including a decoder that extracts additional information included in a mixed sound signal in which a plurality of sound source signals are mixed, by applying a predetermined learning model, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
An information processing method in which a decoder extracts additional information included in a mixed sound signal in which a plurality of sound source signals are mixed, by applying a predetermined learning model, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
A program for causing a computer to execute an information processing method in which a decoder extracts additional information included in a mixed sound signal in which a plurality of sound source signals are mixed, by applying a predetermined learning model, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
An information processing apparatus including a concealer that includes additional information, by applying a predetermined learning model, in at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
An information processing method in which a concealer performs processing to include additional information, by applying a predetermined learning model, in at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
A program for causing a computer to execute an information processing method in which a concealer performs processing to include additional information, by applying a predetermined learning model, in at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
<One Embodiment>
<Modifications>
The embodiments and the like described below are preferred specific examples of the present disclosure, and the content of the present disclosure is not limited to these embodiments and the like.
[Configuration Example of Reproduction System]
FIG. 1 shows a configuration example of a reproduction system (reproduction system 1) according to one embodiment. The reproduction system 1 includes a distribution device 2 and a reproduction device 3 connected via a network NW. There may be a plurality of distribution devices 2 and reproduction devices 3. The distribution device 2 and the reproduction device 3 each correspond to an example of the information processing apparatus. The Internet is a representative example of the network NW, but the network NW may be anything, such as a LAN (Local Area Network), Bluetooth (registered trademark), or Wi-Fi (registered trademark). The distribution device 2 and the reproduction device 3 may also be connected by wire. In this embodiment, the reproduction device 3 is described as a smartphone, but the reproduction device 3 may be anything, such as a personal computer, a hearable device such as headphones or earphones, or another wearable device.
FIG. 2 is a block diagram showing a configuration example of the reproduction device 3 according to one embodiment. The reproduction device 3 includes a control unit 301, a microphone 302A, an audio signal processing unit 302B connected to the microphone 302A, a camera unit 303A, a camera signal processing unit 303B connected to the camera unit 303A, a network unit 304A, a network signal processing unit 304B connected to the network unit 304A, a speaker 305A, an audio reproduction unit 305B connected to the speaker 305A, a display 306A, and a screen display unit 306B connected to the display 306A. The audio signal processing unit 302B, the camera signal processing unit 303B, the network signal processing unit 304B, the audio reproduction unit 305B, and the screen display unit 306B are each connected to the control unit 301.
(First Example)
Next, a system configuration example of one embodiment will be described with reference to FIGS. 3 to 5. First, the first example will be described with reference to FIG. 3.
Next, the second example will be described with reference to FIG. 4. In the second example, the concealer 41 includes digital watermark information WI in the mixed sound signal X. As the sound source signals constituting the mixed sound signal X, the same sound source signals as in the first example (sound source signals XA to XC) can be applied.
Next, the third example will be described with reference to FIG. 5. In the third example, the mixed sound signal X is composed of a drum sound source signal XA and a piano sound source signal XB, and digital watermark information is embedded in each sound source signal. The distribution device 2 has concealers 41A and 41B corresponding to the respective sound source signals. The concealer 41A embeds digital watermark information WIa in the sound source signal XA by applying the learning model 100. Likewise, the concealer 41B embeds digital watermark information WIb in the sound source signal XB by applying the learning model 100. As in this example, the concealer may include a plurality of concealers that perform processing in parallel. Alternatively, a single concealer may sequentially perform the processing of embedding the digital watermark information WIa and WIb. The two signals in which the digital watermark information is embedded are added by an adder 62, thereby generating the mixed sound signal X of this example.
Next, a learning model that can be applied to the concealer and the decoder according to this embodiment will be described with reference to FIG. 6. FIG. 6 is a diagram that generalizes the processing performed in this embodiment. Because FIG. 6 generalizes the processing, it is drawn so as to include all processing that may be performed in this embodiment; however, not all of the processing shown in FIG. 6 necessarily has to be performed.
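The parallel embedding of the third example (FIG. 5) can be sketched as follows. This is a minimal Python sketch in which a toy additive perturbation stands in for the trained learning model 100; the names (`concealer`, `xa`, `wia`, and so on) merely mirror the figure labels, and the toy embedding scheme itself is an assumption, not the patent's method.

```python
import numpy as np

def concealer(source: np.ndarray, watermark_bits: np.ndarray, eps: float = 1e-3) -> np.ndarray:
    """Toy concealer: spreads the watermark bits over the signal as a tiny
    additive perturbation. A real implementation would be a trained neural
    network (the learning model 100 of the text)."""
    carrier = np.resize(2.0 * watermark_bits - 1.0, source.shape)  # bits -> +/-1
    return source + eps * carrier

rng = np.random.default_rng(0)
xa = rng.standard_normal(48000)        # drum source signal XA (1 s at 48 kHz)
xb = rng.standard_normal(48000)        # piano source signal XB
wia = rng.integers(0, 2, 128)          # watermark WIa for XA
wib = rng.integers(0, 2, 128)          # watermark WIb for XB

xa_marked = concealer(xa, wia)         # concealer 41A
xb_marked = concealer(xb, wib)         # concealer 41B
x_mixed = xa_marked + xb_marked        # adder 62 -> mixed sound signal X

# the watermark energy is tiny relative to the source (imperceptibility)
print(np.mean((xa_marked - xa) ** 2) / np.mean(xa ** 2) < 1e-5)  # True
```

As the text notes, the same two embeddings could instead be performed sequentially by a single concealer; only the mixing by the adder is essential to producing X.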
In the notation of FIG. 6 (the mathematical symbols themselves are rendered as images in the source and are omitted here): one symbol denotes the concealer and another denotes the decoder. A further symbol denotes a sound source signal, which may be a single sound source signal or a mixed sound signal of a plurality of sound source signals, and another denotes the digital watermark information. Additional symbols denote the sound source signal in which the digital watermark information is embedded, the signal obtained by adding the sound source signals (that is, the mixed sound signal), and the separated signal obtained by sound source separation processing. One symbol denotes the digital watermark information obtained when the decoder decodes the embedded signal directly, without addition or sound source separation, and another denotes the digital watermark information obtained when the decoder decodes the separated signal. The former can be expressed by one equation, and the digital watermark information obtained by decoding the separated signal resulting from sound source separation can be expressed by another.
Equation (1)
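Equation (1) is rendered as an image in the source, so its exact form cannot be reproduced here; however, based on the four error functions enumerated later (see item (7)), a plausible reading of the loss can be sketched as follows. The function names, the choice of mean squared error, and the unit weights are all assumptions for illustration.

```python
import numpy as np

def mse(a, b) -> float:
    """Mean squared error, used here as a stand-in for each error function."""
    return float(np.mean((np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) ** 2))

def equation1_style_loss(s, s_marked, s_separated, w, w_dec, w_dec_sep) -> float:
    """Sum of the four error terms described for the learning model:
    1) source signal before vs. after embedding (imperceptibility),
    2) embedded signal vs. the corresponding signal recovered by separation,
    3) watermark before embedding vs. watermark decoded without separation,
    4) watermark before embedding vs. watermark decoded after separation."""
    return (mse(s, s_marked)              # term 1
            + mse(s_marked, s_separated)  # term 2
            + mse(w, w_dec)               # term 3
            + mse(w, w_dec_sep))          # term 4

# a perfect concealer/decoder/separator would drive every term to zero
s = np.zeros(8)
w = np.zeros(4)
print(equation1_style_loss(s, s, s, w, w, w))  # 0.0
```

Training the concealer and decoder to minimize such a combined loss is what, per the experiments below, yields good results both with and without intervening source separation.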
Next, the effects obtained in this embodiment will be described with reference to the experimental results shown in FIG. 7.
The experiment was conducted under the following conditions.
Digital watermark information: a string of 16 characters plus an end token indicating the end
Character types: 27 (26 letters of the alphabet plus one end token)
Dataset: a dataset of four sound source signals (vocals, drums, bass, and others)
Sound source separation: a DNN-based sound source separation model
Sound source signal in which the digital watermark information is embedded: the drum sound source signal
Pattern 1 (Random): a pattern in which characters assumed to be correct are assigned at random (probability 1/27)
Pattern 2 (Baseline (original)): a pattern in which a learning model obtained by training with a loss function that does not take the sound source separation processing into account (a loss function consisting only of the first and third terms on the right-hand side of Equation (1)) is applied to the concealer and the decoder.
Pattern 3 (STSS (separation)): a pattern in which a learning model obtained by training with a loss function that takes only the sound source separation processing into account (a loss function consisting only of the second and fourth terms on the right-hand side of Equation (1)) is applied to the concealer and the decoder.
Pattern 4 (STSS (separation + original)): a pattern in which a learning model obtained by training with the loss function shown in Equation (1) is applied to the concealer and the decoder.
In Pattern 2, the "original" results for SDR and CER were good. However, because the learning model was obtained using a loss function that does not take the sound source separation processing into account, the "separation" results for SDR and CER deteriorated to 30.8% and 96.3%, respectively.
In Pattern 3, the "separation" results for SDR and CER were good because the learning model was obtained using a loss function that takes the sound source separation processing into account. However, because that loss function does not take into account decoding without sound source separation, the "original" results for SDR and CER deteriorated to 34.6% and 38.5%, respectively.
In Pattern 4, the "original" and "separation" results for SDR were good at 35.2% and 37.9%, respectively. In general, when the SDR exceeds 30%, changes in the sound source signal or the mixed sound signal caused by embedding the digital watermark information become imperceptible. This experiment confirmed that processing using the learning model 100 satisfies this criterion. Furthermore, the "original" and "separation" results for CER were both good at 0.0%. From the above, it was confirmed that Pattern 4, that is, processing by the concealer and decoder using the learning model 100 according to this embodiment, achieves the highest accuracy.
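The CER (character error rate) figures above compare the decoded payload against the embedded 16-character string. A standard way to compute CER is edit (Levenshtein) distance divided by reference length; a self-contained sketch follows (the example strings are made up, not taken from the experiment):

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance between two strings via dynamic programming."""
    prev = list(range(len(hyp) + 1))
    for i, rc in enumerate(ref, 1):
        cur = [i]
        for j, hc in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (rc != hc)))    # substitution
        prev = cur
    return prev[-1]

def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance / reference length."""
    return levenshtein(ref, hyp) / len(ref)

print(cer("HELLOWORLDABCDEF", "HELLOWORLDABCDEF"))  # 0.0 (perfect decode)
print(cer("HELLOWORLDABCDEF", "HELLOWORLDABCDEX"))  # 0.0625 (1 substitution / 16 characters)
```

With 27 possible characters, random guessing (Pattern 1) would leave the CER near 26/27, which is why the 0.0% CER of Pattern 4 is the strongest result in FIG. 7.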
FIGS. 8A and 8B show an example of a UI (User Interface) applicable to this embodiment. The UIs shown in FIGS. 8A and 8B are displayed on, for example, a display device. FIG. 8A is a UI used mainly on the distribution side, and FIG. 8B is a UI used on the reproduction side.
Although one embodiment of the present disclosure has been described above, the present disclosure is not limited to the above-described embodiment, and various modifications are possible without departing from the spirit of the present disclosure.
Equation (2)
Equation (2) differs from Equation (1) in that a fifth term is added on the right-hand side of Equation (2).
Here (the symbols themselves are rendered as images in the source and are omitted): one symbol denotes the decoder, another denotes the sound source separator for sound source i, and further symbols denote the other sound source signals relative to the sound source signal obtained as a sound source separation result. λ is a weight coefficient, a value determined experimentally.
By adding further terms to the right-hand side of Equation (2), the loss function can be made to handle three or more rounds of sound source separation and mixing, rather than two.
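Since Equation (2) itself is an image in the source, only its described role is reproducible: a λ-weighted fifth term penalizing watermark recovery after a further round of separation, with additional terms extending this to more rounds. A hypothetical sketch of such an extension follows; the callables `mix`, `separate`, and `decode` are stand-ins for the mixing, per-source separator, and decoder, and none of these names come from the patent.

```python
import numpy as np

def watermark_error(w, w_decoded) -> float:
    """Mean squared error between the embedded and decoded watermarks."""
    return float(np.mean((np.asarray(w, dtype=float) - np.asarray(w_decoded, dtype=float)) ** 2))

def extra_round_terms(w, marked_sources, mix, separate, decode, lam=0.1, rounds=2) -> float:
    """Accumulate lambda-weighted watermark-decoding errors over additional
    mix -> separate rounds beyond the first, in the spirit of the extra
    right-hand-side terms added in Equation (2)."""
    penalty = 0.0
    sources = list(marked_sources)
    for _ in range(rounds - 1):
        mixture = mix(sources)       # re-mix the current sources
        sources = separate(mixture)  # re-separate the mixture
        penalty += lam * sum(watermark_error(w, decode(s)) for s in sources)
    return penalty

# toy plumbing: a perfect decoder makes every extra term vanish
w = np.ones(4)
srcs = [np.zeros(8), np.zeros(8)]
penalty = extra_round_terms(w, srcs,
                            mix=lambda ss: np.sum(ss, axis=0),
                            separate=lambda m: [m / 2, m / 2],
                            decode=lambda s: w)
print(penalty)  # 0.0
```

Raising `rounds` adds one λ-weighted term per extra separation/mixing cycle, mirroring how the text says the loss can be extended beyond two rounds.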
(1)
An information processing apparatus including a decoder that extracts additional information included in a mixed sound signal in which a plurality of sound source signals are mixed, by applying a predetermined learning model, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
(2)
The information processing apparatus according to (1), wherein the additional information is information added to the mixed sound signal in which the plurality of sound source signals are mixed.
(3)
The information processing apparatus according to (1), wherein the decoder extracts the additional information from a separated signal obtained by performing sound source separation processing on the mixed sound signal.
(4)
The information processing apparatus according to (1), wherein the additional information is included in at least one sound source signal constituting the mixed sound signal.
(5)
The information processing apparatus according to (4), wherein the additional information is included in each sound source signal constituting the mixed sound signal.
(6)
The information processing apparatus according to (3), further including a sound source separation unit that performs the sound source separation processing.
(7)
The information processing apparatus according to any one of (1) to (6), wherein the learning model is a model obtained by learning to minimize a loss function based on: an error function between the sound source signal before and after the additional information is included; an error function between the signal in which the additional information is included in the sound source signal and the signal corresponding to that sound source signal obtained by sound source separation; an error function between the additional information before and after it is included in the sound source signal; and an error function between the additional information before it is included in the sound source signal and the additional information included in the sound source signal obtained by sound source separation.
(8)
The information processing apparatus according to any one of (1) to (7), wherein the additional information is digital watermark information.
(9)
An information processing method in which a decoder extracts additional information included in a mixed sound signal in which a plurality of sound source signals are mixed, by applying a predetermined learning model, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
(10)
A program for causing a computer to execute an information processing method in which a decoder extracts additional information included in a mixed sound signal in which a plurality of sound source signals are mixed, by applying a predetermined learning model, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
(11)
An information processing apparatus including a concealer that includes additional information, by applying a predetermined learning model, in at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
(12)
The information processing apparatus according to (11), wherein the concealer includes the additional information in the mixed sound signal in which the plurality of sound source signals are mixed.
(13)
The information processing apparatus according to (11), wherein the concealer includes the additional information in at least one of the plurality of sound source signals.
(14)
The information processing apparatus according to (13), wherein the concealer includes the additional information in every one of the plurality of sound source signals.
(15)
The information processing apparatus according to any one of (11) to (14), wherein the learning model applied when including the additional information is selectable.
(16)
The information processing apparatus according to any one of (11) to (14), wherein the learning model is a model obtained by learning to minimize a loss function based on: an error function between the sound source signal before and after the additional information is included; an error function between the signal in which the additional information is included in the sound source signal and the signal corresponding to that sound source signal obtained by sound source separation; an error function between the additional information before and after it is included in the sound source signal; and an error function between the additional information before it is included in the sound source signal and the additional information included in the sound source signal obtained by sound source separation.
(17)
The information processing apparatus according to any one of (11) to (16), wherein the additional information is digital watermark information.
(18)
An information processing method in which a concealer performs processing to include additional information, by applying a predetermined learning model, in at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
(19)
A program for causing a computer to execute an information processing method in which a concealer performs processing to include additional information, by applying a predetermined learning model, in at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
3・・・Reproduction device
41・・・Concealer
51, 51A, 51B, 51C・・・Decoder
100・・・Learning model
305C・・・Sound source separation unit
305D・・・Decoder
WI・・・Digital watermark information
Claims (19)
- An information processing apparatus including a decoder that extracts additional information included in a mixed sound signal in which a plurality of sound source signals are mixed, by applying a predetermined learning model, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
- The information processing apparatus according to claim 1, wherein the additional information is information added to the mixed sound signal in which the plurality of sound source signals are mixed.
- The information processing apparatus according to claim 1, wherein the decoder extracts the additional information from a separated signal obtained by performing sound source separation processing on the mixed sound signal.
- The information processing apparatus according to claim 1, wherein the additional information is included in at least one sound source signal constituting the mixed sound signal.
- The information processing apparatus according to claim 4, wherein the additional information is included in each sound source signal constituting the mixed sound signal.
- The information processing apparatus according to claim 3, further including a sound source separation unit that performs the sound source separation processing.
- The information processing apparatus according to claim 1, wherein the learning model is a model obtained by learning to minimize a loss function based on: an error function between the sound source signal before and after the additional information is included; an error function between the signal in which the additional information is included in the sound source signal and the signal corresponding to that sound source signal obtained by sound source separation; an error function between the additional information before and after it is included in the sound source signal; and an error function between the additional information before it is included in the sound source signal and the additional information included in the sound source signal obtained by sound source separation.
- The information processing apparatus according to claim 1, wherein the additional information is digital watermark information.
- An information processing method in which a decoder extracts additional information included in a mixed sound signal in which a plurality of sound source signals are mixed, by applying a predetermined learning model, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
- A program for causing a computer to execute an information processing method in which a decoder extracts additional information included in a mixed sound signal in which a plurality of sound source signals are mixed, by applying a predetermined learning model, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
- An information processing apparatus including a concealer that includes additional information, by applying a predetermined learning model, in at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
- The information processing apparatus according to claim 11, wherein the concealer includes the additional information in the mixed sound signal in which the plurality of sound source signals are mixed.
- The information processing apparatus according to claim 11, wherein the concealer includes the additional information in at least one of the plurality of sound source signals.
- The information processing apparatus according to claim 13, wherein the concealer includes the additional information in every one of the plurality of sound source signals.
- The information processing apparatus according to claim 11, wherein the learning model applied when including the additional information is selectable.
- The information processing apparatus according to claim 11, wherein the learning model is a model obtained by learning to minimize a loss function based on: an error function between the sound source signal before and after the additional information is included; an error function between the signal in which the additional information is included in the sound source signal and the signal corresponding to that sound source signal obtained by sound source separation; an error function between the additional information before and after it is included in the sound source signal; and an error function between the additional information before it is included in the sound source signal and the additional information included in the sound source signal obtained by sound source separation.
- The information processing apparatus according to claim 11, wherein the additional information is digital watermark information.
- An information processing method in which a concealer performs processing to include additional information, by applying a predetermined learning model, in at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
- A program for causing a computer to execute an information processing method in which a concealer performs processing to include additional information, by applying a predetermined learning model, in at least one of a plurality of sound source signals and a mixed sound signal in which the plurality of sound source signals are mixed, wherein a change in the sound source signals and the mixed sound signal due to the addition of the additional information is rendered imperceptible.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202280063574.6A CN117980990A (zh) | 2021-09-28 | 2022-02-16 | 信息处理装置、信息处理方法和程序 |
EP22875364.6A EP4411733A1 (en) | 2021-09-28 | 2022-02-16 | Information processing device, information processing method, and program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2021-157816 | 2021-09-28 | ||
JP2021157816 | 2021-09-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023053480A1 true WO2023053480A1 (ja) | 2023-04-06 |
Family
ID=85782154
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/006048 WO2023053480A1 (ja) | 2021-09-28 | 2022-02-16 | 情報処理装置、情報処理方法およびプログラム |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP4411733A1 (ja) |
CN (1) | CN117980990A (ja) |
WO (1) | WO2023053480A1 (ja) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004500728A (ja) * | 1998-06-05 | 2004-01-08 | 日本電気株式会社 | データ調製及び電子透かし挿入方法 |
WO2018047643A1 (ja) | 2016-09-09 | 2018-03-15 | ソニー株式会社 | 音源分離装置および方法、並びにプログラム |
- 2022
- 2022-02-16 CN CN202280063574.6A patent/CN117980990A/zh active Pending
- 2022-02-16 EP EP22875364.6A patent/EP4411733A1/en active Pending
- 2022-02-16 WO PCT/JP2022/006048 patent/WO2023053480A1/ja active Application Filing
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2004500728A (ja) * | 1998-06-05 | 2004-01-08 | 日本電気株式会社 | データ調製及び電子透かし挿入方法 |
WO2018047643A1 (ja) | 2016-09-09 | 2018-03-15 | ソニー株式会社 | 音源分離装置および方法、並びにプログラム |
Also Published As
Publication number | Publication date |
---|---|
EP4411733A1 (en) | 2024-08-07 |
CN117980990A (zh) | 2024-05-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7900256B2 (en) | Communication apparatus, communication method, and recording medium used therewith | |
US11170793B2 (en) | Secure audio watermarking based on neural networks | |
WO2006077061A1 (en) | Method of embedding a digital watermark in a useful signal | |
KR20090000898A (ko) | 저작권이 적용된 사용자 손수 저작물의 생성과 운용을 위한방법 및 장치 | |
MXPA02006205A (es) | Metodo, aparato y colocacion para insertar informacion extra. | |
JP2003513362A (ja) | 電子メディア安全配布用能動型データ隠蔽 | |
US8644501B2 (en) | Paired carrier and pivot steganographic objects for stateful data layering | |
WO2009023086A3 (en) | Stochastic halftone images based on screening parameters | |
CN108366299A (zh) | 一种媒体播放方法以及装置 | |
WO2020066399A1 (ja) | 情報処理装置、情報処理方法、およびプログラム | |
WO2023053480A1 (ja) | 情報処理装置、情報処理方法およびプログラム | |
Fadhil et al. | Improved Security of a Deep Learning-Based Steganography System with Imperceptibility Preservation | |
Ogundokun et al. | Modified least significant bit technique for securing medical images | |
Zamani et al. | An artificial-intelligence-based approach for audio steganography | |
Zamani et al. | A novel approach for audio watermarking | |
WO2021035978A1 (zh) | 信息隐写方法、装置、设备及存储介质 | |
US7231271B2 (en) | Steganographic method for covert audio communications | |
Khan et al. | Implementation of variable least significant bits stegnography using dddb algorithm | |
WO2022236451A1 (en) | Robust authentication of digital audio | |
Nair et al. | Audio watermarking in wavelet domain using Fibonacci numbers | |
JP3770732B2 (ja) | 画像への情報添付方法および画像からの情報抽出方法 | |
Heylen et al. | An image watermark tutorial tool using Matlab | |
CN110619883A (zh) | 音乐的信息嵌入方法、提取方法、装置、终端及存储介质 | |
WO2023047620A1 (ja) | 情報処理装置、情報処理方法およびプログラム | |
US20240371390A1 (en) | Information processing device, information processing method, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22875364; Country of ref document: EP; Kind code of ref document: A1 |
| WWE | Wipo information: entry into national phase | Ref document number: 18691047; Country of ref document: US |
| WWE | Wipo information: entry into national phase | Ref document number: 202280063574.6; Country of ref document: CN |
| WWE | Wipo information: entry into national phase | Ref document number: 202427025256; Country of ref document: IN |
| NENP | Non-entry into the national phase | Ref country code: DE |
| ENP | Entry into the national phase | Ref document number: 2022875364; Country of ref document: EP; Effective date: 20240429 |