WO2001043422A1 - Information processing method and recorded medium - Google Patents

Information processing method and recorded medium


Publication number
WO2001043422A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
digital
embedding
image information
audio
Prior art date
Application number
PCT/JP1999/006838
Other languages
French (fr)
Japanese (ja)
Inventor
Masahiko Katsuragi
Yasuhiro Akiyama
Kenta Morishima
Original Assignee
Hitachi, Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi, Ltd
Priority to PCT/JP1999/006838
Publication of WO2001043422A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N1/32101 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N1/32144 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title embedded in the image data, i.e. enclosed or integrated in the image, e.g. watermark, super-imposed logo or stamp
    • H04N1/32149 Methods relating to embedding, encoding, decoding, detection or retrieval operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/0021 Image watermarking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N2201/00 Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
    • H04N2201/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N2201/3201 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N2201/3225 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of data relating to an image, a page or a document
    • H04N2201/3226 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of data relating to an image, a page or a document of identification information or the like, e.g. ID code, index, title, part of an image, reduced-size image
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N2201/00 Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
    • H04N2201/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N2201/3201 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N2201/3261 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of multimedia information, e.g. a sound signal
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N2201/00 Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
    • H04N2201/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N2201/3201 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N2201/3269 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of machine readable codes or marks, e.g. bar codes or glyphs
    • H04N2201/327 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of machine readable codes or marks, e.g. bar codes or glyphs which are undetectable to the naked eye, e.g. embedded codes

Definitions

  • The present invention relates to a technology for embedding a digital watermark in original image information in order to clarify the copyright or the like of original image information created by a digital information content provider or the like, and to information processing for embedding a digital watermark.
  • The present invention relates to a method, an information processing device, and a technology effective when applied to an information processing system for electronic approval or electronic transactions using information in which a digital watermark is embedded.
  • Background art
  • Digital watermarking technology embeds and hides information in multimedia data such as images and sounds. If invisible digital watermark information is embedded in an electronic work or the like and remains there, it can be expected to suppress unauthorized copying of the work.
  • The ID (identification) number encoded for each copyright holder of the information and each company that holds the distribution right to the original information is encrypted.
  • ID information that corresponds one-to-one to information such as image data, audio data, and text is encrypted and embedded in the original information.
  • the formula was embedded as image or text information.
  • A key number is used to detect the ID number at that time.
  • The key information includes information that can identify the location where the ID number is embedded.
  • As a technology in which digital watermarking is applied from a viewpoint other than the suppression of illegal copying, there is the video device described in Japanese Patent Application Laid-Open No. H10-290404. It embeds subtitles and audio digital signals in multiple languages into digital video signals using digital watermarking technology, so that text and audio information in multiple languages can be recorded without changing the recording format, and at playback any subtitle or audio digital signal extracted from the video signal can be selected and reproduced. Its purpose is to efficiently embed subtitles or audio information related to video information using digital watermarking technology, and the audio information must be a digital watermark that can be clearly restored while maintaining the quality of the reproduced information, for example so that the voices and sounds of each performer can be distinguished from each other. This is completely different from digital watermarking technology used as an ID for protection.
  • Japanese Patent Application Laid-Open No. 11-166686 discloses an image information processing technique using information such as images, sounds, and text as embedded information.
  • In that technique, the embedded information is superimposed on the original information as visible information, and the technique embeds information that accompanies the original information. The embedded information is therefore one type of multimedia information added to the original information, and is information that should be actively exposed to users. This is completely different from the information embedding technology considered here.
  • When the technique is applied to children's picture books and a landscape image of a zoo is used as the image information to be embedded into, for example, a lion's cry is embedded in the picture of a lion, and information about the ecology of penguins is embedded in the picture of a penguin. Because the embedded information and the image it is embedded in are easy to associate with each other, the user can easily find the desired embedded information among a plurality of embedded items.
  • digital watermark information used as an ID is an image, text, or information obtained by encrypting them.
  • Image information in which digital watermark information serving as ID information is embedded is subject to various disturbances, for example noise added during transmission and reception of information in a network business, information loss during transmission, encryption processing, code compression processing, decryption processing, and code decompression processing. Digital watermark information must therefore be resistant to such disturbances. This is because the purpose is to protect copyright, and when the watermark information is extracted, the meaning of the ID information must ultimately be confirmed accurately.
  • An object of the present invention is to provide an information processing method that reduces the amount of embedded information relative to the original image information as compared with conventional invisible digital embedding technology and that realizes a digital watermark as ID information that is resistant to disturbance and information loss, a recording medium and a transmission medium for a program that contributes to the realization of the information processing method, and an information processing apparatus and an information processing system.
  • Another object of the present invention is to provide an information processing method capable of relatively easily recognizing ID information carried by a digital watermark even in a low-cost system, a recording medium and a transmission medium for a program contributing to the realization of the information processing method, and an information processing device and an information processing system.
  • Still another object of the present invention is to provide an information processing apparatus that has a small amount of embedded information relative to original image information, and is strong against disturbance to the embedded original information and against information loss.
  • An object of the present invention is to provide a recording medium storing a program capable of easily realizing embedding of ID information by a digital watermark and reproduction of the embedded ID information.
  • The information processing method from the viewpoint of embedding digital watermark information is characterized by embedding, in the digitized original image information (that is, in the sample value area of the digitized original image information), separable digital voice ID information having meaning as ID information, as invisible digital watermark information.
  • An information processing method according to a more detailed aspect includes an embedding position determination process for determining a plurality of embedding positions for embedding digital watermark information in digital original image information by referring to the original digital image information, an order determination process for determining an embedding order for the determined embedding positions, and an embedding process for embedding digital voice ID information having meaning as ID information at the embedding positions, in accordance with the embedding order, as invisible digital watermark information.
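
The three steps described above (position determination, order determination, embedding) can be sketched as follows. This is a minimal illustration, not the patent's implementation: least-significant-bit substitution, the key-seeded shuffle, and all function names are assumptions introduced here. Positions are chosen from the upper seven bits of each pixel, so the same positions can be recomputed from the watermarked image.

```python
import random

def choose_positions(pixels, n_bits, key):
    """Position and order determination (hypothetical rule, not from the patent).

    Candidates are pixels whose upper 7 bits are not saturated, so the choice
    refers to the image but is unaffected by changes to the lowest bit.
    A key-seeded shuffle then fixes the embedding order.
    """
    candidates = [i for i, p in enumerate(pixels) if 0 < (p >> 1) < 127]
    if len(candidates) < n_bits:
        raise ValueError("image too small for the voice ID payload")
    rng = random.Random(key)
    rng.shuffle(candidates)
    return candidates[:n_bits]

def embed_voice_id(pixels, voice_bits, key):
    """Embed voice ID bits into the lowest bit of the chosen pixels (assumed method)."""
    positions = choose_positions(pixels, len(voice_bits), key)
    marked = list(pixels)
    for pos, bit in zip(positions, voice_bits):
        marked[pos] = (marked[pos] & ~1) | bit
    return marked

def extract_voice_id(marked, n_bits, key):
    """Recompute the positions from the marked image and read the bits back in order."""
    positions = choose_positions(marked, n_bits, key)
    return [marked[pos] & 1 for pos in positions]
```

Each pixel changes by at most one grey level, which is what keeps the watermark invisible in this sketch; a real system would need a more robust embedding rule to survive compression or noise.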
  • In general, the larger the number of quantization bits, the shorter the sampling period, and the larger the number of audio data items of the digital audio ID information used as invisible digital watermark information, the more the strength of the audio ID watermark can be increased.
  • For the voice ID, it is only necessary that the meaning of the voice can be recognized (understood) by a person (checker) or by a machine when the watermark is finally extracted, and a modest voice quality (sound quality, noise level, gender identification, individuality identification, etc.) is good enough.
  • the amount of embedded information relative to the original image information is small, and it is possible to realize a digital watermark as ID information that is strong against disturbance to the embedded original image information and also against information loss.
  • the audio parameters such as the number of quantization bits and the sampling frequency
  • the information processing method which focuses on the frequency conversion value region (frequency domain) of the digitized original image information is performed by a frequency conversion process of the digitized original image information and a predetermined frequency obtained by the conversion process.
  • An information processing method according to a more detailed aspect of this information processing method includes a conversion process of performing frequency conversion on digital original image information; an embedding position determining process for determining, as embedding positions, a plurality of embedding frequency components for embedding digital watermark information from among the frequency components obtained by the conversion process; an order determining process for determining an embedding order for the determined plurality of embedding positions; an embedding process of embedding digital audio ID information having meaning as ID information at the embedding positions, in accordance with the embedding order, as invisible digital watermark information; and an inverse conversion process of performing the inverse of the frequency conversion on the information that has passed through the embedding process.
  • Embedding audio ID information in the frequency domain slightly increases the amount of information processing compared to embedding in the sample value domain, but can easily increase the watermark strength.
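
The frequency-domain flow (convert, embed into selected components in a fixed order, inverse-convert) can be sketched as below. The source does not say which transform or embedding rule is used, so the one-dimensional DCT, the parity-based quantization with step `STEP`, and the function names are all assumptions made for illustration. Sample values are kept as floats here; rounding them back to integers would add a disturbance that a real scheme has to tolerate.

```python
import math

STEP = 8.0  # assumed quantization step; larger values give a stronger but more visible mark

def dct(block):
    """Orthonormal DCT-II of a 1-D block of samples."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(block)))
    return out

def idct(coeffs):
    """Inverse of dct(): the transpose of the orthonormal DCT-II matrix."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

def embed_block(block, bits, order):
    """Embed bits into the coefficients listed in `order`, then inverse-transform."""
    coeffs = dct(block)
    for k, bit in zip(order, bits):
        q = math.floor(coeffs[k] / STEP)
        if q % 2 != bit:          # force the parity of the quantization index to the bit
            q += 1
        coeffs[k] = q * STEP
    return idct(coeffs)

def extract_block(marked, n_bits, order):
    """Transform again and read the parity of each selected coefficient."""
    coeffs = dct(marked)
    return [round(coeffs[k] / STEP) % 2 for k in order[:n_bits]]
```

The extra forward and inverse transforms are the additional processing cost that the text attributes to frequency-domain embedding.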
  • For the digital voice ID information, even relatively low-accuracy voice information is sufficient for ID recognition, for example voice information specified by voice parameters consisting of a number of quantization bits selected from 3 to 16 bits and a sampling frequency selected from the range of 3 to 40 kHz.
  • There is also no inconvenience if the digitized voice ID information is quantized by a 1-bit oversampling method and is voice information specified by voice parameters having a sampling frequency of 3 to 40 kHz. Because embedding of the digital audio ID information is not constrained to follow the position pattern of the original image, as embedding of an image ID is, it is easy to embed it multiple times.
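
To make the parameter ranges concrete, the sketch below computes the raw payload size implied by a choice of quantization bits and sampling frequency, and quantizes one sample to that bit depth. The function names and the one-second duration used in the usage note are illustrative assumptions; the source gives only the parameter ranges.

```python
def voice_id_bits(quant_bits, sampling_hz, duration_s):
    """Raw size in bits of an uncompressed voice ID with the given parameters."""
    return int(quant_bits * sampling_hz * duration_s)

def quantize(sample, quant_bits):
    """Map a sample in [-1.0, 1.0] to an unsigned code of quant_bits bits."""
    levels = (1 << quant_bits) - 1
    clipped = max(-1.0, min(1.0, sample))
    return round((clipped + 1.0) / 2.0 * levels)

def payload_fraction(payload_bits, width, height, bits_per_pixel=1):
    """Share of the embedding capacity used, assuming bits_per_pixel bits per pixel."""
    return payload_bits / (width * height * bits_per_pixel)
```

For example, `voice_id_bits(3, 3000, 1.0)` gives 9000 bits for one second at the low end of the ranges, and `voice_id_bits(16, 40000, 1.0)` gives 640000 bits at the high end.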
  • With an image ID, the embedding position is determined depending on the position pattern of the image, such that the image ID information is embedded as noise in a so-called redundant portion that is not important for human perception.
  • With a voice ID, the amount of information is small and the information is inherently invisible, so it is easy to embed it multiple times. In that case, if the embedding positions of the current voice ID information are determined so that their center of gravity differs from the center of gravity of the plurality of embedding positions that have already been used, reproduction of the ID information becomes easy.
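
The center-of-gravity rule for multiple embedding can be sketched as a simple check. The distance threshold `min_dist`, the representation of positions as `(x, y)` tuples, and the function names are assumptions for illustration; the source states only that the new centroid should differ from the centroids already in use.

```python
import math

def centroid(positions):
    """Center of gravity of a list of (x, y) embedding positions."""
    n = len(positions)
    return (sum(x for x, _ in positions) / n,
            sum(y for _, y in positions) / n)

def pick_distinct_centroid(candidate_sets, embedded_sets, min_dist=1.0):
    """Return the first candidate set whose centroid is at least min_dist
    away from the centroid of every set that is already embedded."""
    used = [centroid(s) for s in embedded_sets]
    for cand in candidate_sets:
        cx, cy = centroid(cand)
        if all(math.hypot(cx - ux, cy - uy) >= min_dist for ux, uy in used):
            return cand
    return None  # no candidate is far enough from the existing centroids
```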
  • The information processing method from the viewpoint of reproducing digital watermark information, in the case where digital audio ID information is embedded in the sample value area of the original digital image information, includes a separation process for separating the digital audio ID information from information in which separable digital audio ID information is embedded as ID information in the original digital image information as invisible digital watermark information, and a sound reproduction process for reproducing the separated digital audio ID information as sound.
  • the audio playback is performed using the audio configuration parameters for the separated and reconstructed digital audio ID information, and the data order during reproduction follows the embedding order of the digital audio ID information.
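
Separation and reconstruction in the embedding order can be sketched as follows. The sketch assumes the bits were stored in the lowest bit of each pixel and that the position list is available to the reader, for example from key information; both are assumptions, as are the function names.

```python
def separate_bits(marked_pixels, positions):
    """Read the embedded bits back in the same order in which they were embedded."""
    return [marked_pixels[p] & 1 for p in positions]

def bits_to_samples(bits, quant_bits):
    """Regroup the bit stream into audio samples using the audio parameter
    quant_bits (most significant bit first); trailing partial samples are dropped."""
    samples = []
    for start in range(0, len(bits) - quant_bits + 1, quant_bits):
        value = 0
        for bit in bits[start:start + quant_bits]:
            value = (value << 1) | bit
        samples.append(value)
    return samples
```

The resulting sample list, together with the sampling frequency, is what a playback routine would need to reproduce the voice ID as sound.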
  • the information processing method for reproduction is based on the fact that an ID is assigned to a predetermined frequency component of the digitized original image information.
  • the invention according to another aspect provides a program for implementing the information processing method on an information processing device such as a computer device.
  • The recording medium of the program is a medium, such as a CD-ROM, for statically recording the program.
  • the transmission medium is a communication medium that dynamically transmits the program for electronically, electromagnetically, or optically distributing or distributing the program via a network connected by a wired line or a wireless line (
  • On the recording medium of the program, a program for causing an information processing device to execute a process of separably embedding digital audio ID information, which has meaning as ID information, in the original digital image information as invisible digital watermark information, and a process of separating the digitized voice ID information from the information so embedded, is recorded so as to be readable by the information processing device.
  • On the recording medium, a program is recorded, so as to be readable by the information processing device, for causing the information processing device to execute a first conversion process for frequency-converting the original digital image information; an embedding process for embedding digital audio ID information having meaning as ID information, as invisible digital watermark information, in a predetermined frequency component obtained in the first conversion process; an inverse conversion process for performing the inverse of the frequency conversion on the information that has undergone the embedding process; a second conversion process for converting into frequency components; and a separation process for separating the digital audio ID information from the information obtained in the second conversion process.
  • When embedding the ID information in the sample value area, the transmission medium of the program embeds the digitized audio ID information having meaning as the ID information in the original digitized image information so as to be separable as invisible digital watermark information.
  • the transmission medium, the first conversion processing function for frequency conversion of the original image information, and the predetermined frequency component obtained by the first conversion processing function have the ID
  • a program for realizing the processing device is transmitted.
  • the program may include a description of a processing procedure for controlling or realizing the processing of reproducing the separated digital audio ID information.
  • the amount of embedded information relative to the original image information is small, and it is strong against disturbance to the embedded original information and also against information loss.
  • the information processing device can easily realize the embedding of the ID information by the electronic watermark and the separation of the embedded ID information.
  • a recording medium records information in which digital audio ID information is embedded as digital watermark information in original image information.
  • A recording medium is a recording medium on which image information is recorded so as to be readable by an information processing apparatus. The image information is information in which digital audio ID information having meaning as an ID is embedded, as invisible digital watermark information, in the sample value area of the original digital image information or in the frequency domain obtained by frequency-converting the original digital image information, and the digital audio ID information can be separated from the sample value area or the frequency domain by the information processing device.
  • the image information is information in which digital audio ID information having meaning as ID information is separably embedded as invisible digital watermark information in original digital image information.
  • Digital audio ID information having meaning as ID information is embedded as invisible digital watermark information in a predetermined frequency component of the original digital image information, and the digital voice ID information is information that is separable through frequency conversion.
  • a transmission medium transmits the image information in which the digital audio ID information is separably embedded in the original digital image information to an information processing apparatus.
  • An information processing device, such as a convenience store device, that inputs the image information recorded on the recording medium or transmitted via the transmission medium can, as necessary, separate the voice ID information serving as digital watermark information and confirm its contents. This processing can be realized relatively easily even by a simple, low-cost information processing device.
  • the present invention from the viewpoint of an information processing system that processes or uses information in which audio ID information is embedded in original image information takes into account an information processing system for electronic approval, electronic commerce, information distribution, and the like.
  • The information processing system includes a first information processing apparatus that outputs image information in which digital voice ID information having meaning as an ID is separably embedded, as invisible digital watermark information, in the sample value area of the original digital image information or in the frequency domain obtained by frequency-converting the original digital image information, and a second information processing apparatus that inputs the image information and uses the input image information as electronic approval information.
  • The information processing system includes a first information processing apparatus that outputs image information in which digitized voice ID information having meaning as an ID is separably embedded, as invisible digital watermark information, in the sample value area of the digitized original image information or in the frequency domain obtained by frequency-converting the digital original image information, and a second information processing device that inputs the image information and uses the input image information as electronic signature information.
  • The information processing system includes an information processing device that stores image information in which digital audio ID information having meaning as an ID is separably embedded, as invisible digital watermark information, in the sample value area of the original digital image information or in the frequency domain obtained by frequency-converting the original digital image information, and that outputs the stored image information in response to a distribution request.
  • When information obtained by embedding the voice ID information in the original image information is processed or used in electronic approval, electronic commerce, information distribution, and the like, a function of preventing unauthorized copying can be expected.
  • The information processing apparatus includes an input unit and an arithmetic control unit. Digital original image information is input to the input unit, and the arithmetic control unit can embed separable digital audio ID information having meaning as ID information into the input digital original image information as invisible digital watermark information.
  • An information processing apparatus includes a storage unit and an operation control unit. The storage unit stores digital original image information, and the operation control unit can perform frequency conversion on the digital original image information stored in the storage unit, embed digital audio ID information having meaning as ID information as invisible digital watermark information in a predetermined frequency component obtained by the frequency conversion, and apply the inverse of the frequency conversion to the information after the embedding.
  • An information processing apparatus includes a storage unit and an operation control unit. The storage unit stores digital original image information, and the operation control unit can refer to the digital original image information stored in the storage unit to determine a plurality of embedding positions for embedding digital watermark information in the digital original image information, determine an embedding order for the determined embedding positions, and embed digital voice ID information having meaning as ID information at the embedding positions, in accordance with the embedding order, as invisible digital watermark information.
  • An information processing apparatus includes a storage unit and an operation control unit. The storage unit stores digital original image information, and the operation control unit can frequency-convert the original digital image information stored in the storage unit, determine as embedding positions a plurality of embedding frequency components for embedding digital watermark information from among the frequency components obtained by the frequency conversion, determine the embedding order for the plurality of embedding positions, embed digitized voice ID information having meaning as ID information as invisible digital watermark information at the embedding positions in accordance with the embedding order, and apply the inverse of the frequency conversion to the information that has passed through the embedding.
  • When digital audio ID information has already been embedded as an invisible digital watermark, the operation control means may determine the embedding positions of the current voice ID information so that their center of gravity differs from the center of gravity of the plurality of embedding positions that have already been used.
  • the audio parameters of the audio ID information may be calculated from the state of the original image information or selected from a table.
  • The information processing apparatus adopting the latter includes an input unit, a storage unit, and an arithmetic control unit. The storage unit stores embedding control information, in which the positions and order for embedding invisible digital watermark information in the sample value area of the original image information or in the frequency domain obtained by frequency-converting the original image information are determined in advance, and voice parameter information, in which the voice configuration parameters of the digital audio ID information having meaning as the digital watermark information are determined in advance.
  • the input means inputs digital original image information.
  • The arithmetic control means selects the required voice parameters and embedding control information from the storage means and embeds, in the sample value area of the digital original image information or in the frequency domain obtained by frequency-converting the digital original image information, digitized voice ID information conforming to the selected voice parameters, at the positions and in the order specified by the selected embedding control information.
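
Selecting voice parameters from a stored table can be sketched as a lookup against the available embedding capacity. The table entries, the selection criterion (the highest-fidelity entry whose payload fits), and the names below are hypothetical; the source says only that required parameters are selected from predetermined stored information.

```python
# Hypothetical stored table, ordered from smallest to largest payload.
VOICE_PARAMETER_TABLE = [
    {"quant_bits": 3, "sampling_hz": 3000},
    {"quant_bits": 8, "sampling_hz": 8000},
    {"quant_bits": 16, "sampling_hz": 40000},
]

def select_voice_parameters(capacity_bits, duration_s):
    """Pick the last (highest-fidelity) table entry whose payload fits the capacity."""
    best = None
    for entry in VOICE_PARAMETER_TABLE:
        needed = entry["quant_bits"] * entry["sampling_hz"] * duration_s
        if needed <= capacity_bits:
            best = entry
    return best  # None when even the smallest entry does not fit
```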
  • The information processing device includes an input unit, a storage unit, and an arithmetic control unit. The storage unit stores encrypted embedding control information, obtained by encrypting embedding control information in which the positions and order for embedding invisible digital watermark information in the sample value area of the digitized original image information or in the frequency domain obtained by frequency-converting the digitized original image information are determined in advance, and encrypted voice parameter information, obtained by encrypting voice parameter information in which the voice configuration parameters of the digital voice ID information having meaning as the digital watermark information are specified in advance.
  • the input means inputs digital original image information.
  • The arithmetic control means selects the required encrypted voice parameters and encrypted embedding control information from the storage means, decodes them, and embeds, in the sample value area of the digitized original image information or in the frequency domain obtained by frequency-converting the digital original image information, digital audio ID information conforming to the decoded voice parameter information, at the positions and in the order specified by the decoded embedding control information.
  • An information processing apparatus from the viewpoint of reproducing ID information includes input means and arithmetic control means. Information in which digital audio ID information having meaning as ID information is separably embedded in the original digital image information as invisible digital watermark information is input to the input means, and the arithmetic control means can separate the digital audio ID information from the input information.
  • An information processing apparatus includes input means and arithmetic control means. The input means receives information in which digital voice ID information carrying meaning as ID information is embedded, as invisible digital watermark information, in predetermined frequency components of the digital original image information.
  • the arithmetic control unit can perform frequency conversion on the input information, and can separate the digitized audio ID information from the information obtained by the frequency conversion.
  • the arithmetic control means may further perform a noise removal filtering process on the separated digitized audio ID information.
  • The filtering process makes it easier to recognize the voice ID.
  • The arithmetic control unit may further perform a speech recognition process on the digital voice ID information that has been subjected to the noise removal filtering process. This makes automatic recognition of the voice ID possible.
  • FIG. 1 is an explanatory diagram showing an example of a method of embedding digital audio ID information as a digital watermark in a sample value area of digital original image information.
  • FIG. 2 is an explanatory diagram showing an example of a method for embedding digital audio ID information as a digital watermark in the frequency domain of digital original image information.
  • FIG. 3 is an explanatory view exemplifying a rule that specifies the order of embedding digital voice ID information.
  • FIG. 4 is a flowchart showing in detail a processing procedure for embedding the audio ID information by the method of FIG.
  • FIG. 5 is an explanatory diagram illustrating a method of separating the digital audio ID information and the original image information from the image information in which the digital audio ID information is embedded by the method of FIG.
  • FIG. 6 is an explanatory diagram exemplifying a method of separating the digital audio ID information and the original image information from the image information in which the digital audio ID information is embedded by the method of FIG.
  • FIG. 7 is a flowchart showing in detail the processing procedure for separating audio ID information and original image information by the method of FIG.
  • FIG. 8 is a block diagram showing a basic configuration example of an information processing system assuming a case where a digital watermark using audio ID information is applied to electronic approval and electronic commerce.
  • FIG. 9 is an explanatory diagram showing in detail the flow of processing by the information processing apparatus in FIG.
  • FIG. 10 is an explanatory diagram highlighting the embedding locations when digital watermarks based on audio ID information are multiplexed into image information.
  • FIG. 11 is an explanatory diagram showing an example of embedding an electronic watermark by voice ID information in an electronic approval document and electronic identification documents.
  • FIG. 12 is a block diagram illustrating an outline of an electronic commerce or electronic approval system using voice ID information.
  • FIG. 13 is a block diagram showing an example of an information processing device that uses voice ID information for digital watermarking, and an example of an information processing network including the information processing device.
  • FIG. 1 shows an example of a method of embedding a digital audio ID as a digital watermark in a sample value area of digital original image information.
  • FIG. 2 shows an example of a method for embedding digital audio ID information as a digital watermark in the frequency domain of digital original image information.
  • When the digital audio ID information 3 is embedded in the original image information as a digital watermark, it may be embedded in the sample value area, that is, in the raw original image information 1 itself, as shown in Fig. 1.
  • Alternatively, as shown in Fig. 2, it may be embedded in a frequency domain obtained by applying a frequency transformation (orthogonal transformation) process 6 such as an FFT (Fast Fourier Transform) to the raw original image information 1.
  • A plurality of types of digital voice ID information 3 having different voice parameters are prepared in advance, selected by the selection means 4, and used for the watermark embedding processes 2 and 7.
  • The digital audio ID information 3 is digital audio information generated by A/D conversion of analog audio indicating copyright ownership or the like, or digital audio information generated by digital audio synthesis.
  • In the case of Fig. 2, the information that has undergone the embedding process is subjected to an inverse conversion process 8, the inverse of the frequency conversion process.
  • what is indicated by 5 and 9 is information formed by embedding digital audio ID information 3 in the original image information as a digital watermark.
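As a rough illustration of the Fig. 2 flow (frequency transformation 6, watermark embedding 7, inverse transformation 8), the following Python sketch embeds a few bits into DFT coefficients of a single image row and reads them back. The text does not say how a coefficient is modified, so the parity-quantization rule, the step size, and the names `dft`, `idft`, `embed_bits`, and `extract_bits` are all assumptions made for illustration. A naive DFT stands in for the FFT.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform (stand-in for the FFT of process 6)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Inverse transform (process 8). The spectrum is kept conjugate-symmetric,
    so the imaginary parts cancel and only the real part is returned."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]

def embed_bits(row, bits, step=8.0):
    """Assumed rule: force round(Re(X[k]) / step) to have the parity of the bit,
    starting from the highest usable frequency and moving downward."""
    X = dft(row)
    n = len(X)
    if len(bits) > n // 2 - 1:
        raise ValueError("too many bits for this row length")
    for i, bit in enumerate(bits):
        k = n // 2 - 1 - i
        q = round(X[k].real / step)
        if q % 2 != bit:
            q += 1
        X[k] = complex(q * step, X[k].imag)
        X[n - k] = X[k].conjugate()  # keep the inverse transform real-valued
    return idft(X)

def extract_bits(row, count, step=8.0):
    """Re-apply the transform and read back the parity of each coefficient."""
    X = dft(row)
    n = len(X)
    return [round(X[n // 2 - 1 - i].real / step) % 2 for i in range(count)]

row = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]
bits = [1, 0, 1]
marked = embed_bits(row, bits)
recovered = extract_bits(marked, len(bits))
```

The marked samples here are floating-point values; rounding them back to integer pixel levels, or passing them through lossy coding, perturbs the coefficients, which is one reason a real scheme would need a larger step or redundancy.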
  • The optimal embedding position of the voice ID information can be determined from the original digital image information according to a conventional, general digital watermark embedding position determination algorithm. In doing so, the change to the original information is kept as small as possible.
  • For example, the digital audio ID information is embedded by exploiting the masking effect, in which low-luminance pixels near high-luminance pixels are difficult to see.
  • Alternatively, the digital audio ID information is embedded in a portion having many high-frequency components, where embedded watermark information is difficult to perceive.
  • The number of quantization bits of the voice a(i), the sampling period (frequency) Δt, the number of voice data (time) n, and so on need to be determined according to the watermark embedding strength required for the audio ID information.
  • The greater the number of quantization bits, the smaller the sampling period, and the greater the number of audio data, the stronger the watermark strength of the audio ID information can be made. In other words, the ability to maintain the original form of the information against disturbances such as noise and information loss is high.
  • A feature of the digital voice ID information is that, when it is extracted as an embedded digital watermark, the meaning of the voice can be recognized by a person (checker) or by a machine (semantic understanding). Even if the quality of the sound, such as noise level, gender identification, and individuality identification, is poor, it is sufficient if the meaning can be understood. For example, as digital audio configuration parameters, if the number of quantization bits is 4 bits, the sampling frequency is 4 kHz, and the digital audio time is 1 second to 2 seconds, it is sufficient that the watermark data amount of the digital audio ID information is 500 to 100 points at the sample points.
  • In the illustrated examples, the digital voice ID information 3 is prepared in advance and is selected by the selection means 4 as appropriate. That is, the voice parameters for embedding are determined in the watermark embedding processes 2 and 7, and the digital voice ID information is selected by the selection means 4 in accordance with those parameters. Not all voice parameters of the selectable digital voice ID information need be selectable. It is only necessary to prepare several types of digital audio ID information in which the number n of audio data and the sampling period Δt are fixed.
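To make the role of the quantization-bit parameter concrete, the following Python sketch converts a short tone into 4-bit codes and back. Uniform quantization over the range [-1, 1], the 440 Hz test tone, and the function names `quantize` and `dequantize` are assumptions for illustration; the text only states that the number of quantization bits, the sampling period, and the number of data are the configurable voice parameters.

```python
import math

def quantize(samples, bits=4):
    """Map floats in [-1.0, 1.0] to integer codes 0 .. 2**bits - 1
    (assumed uniform quantizer)."""
    levels = 2 ** bits
    codes = []
    for s in samples:
        s = max(-1.0, min(1.0, s))            # clip out-of-range input
        code = int((s + 1.0) / 2.0 * levels)  # scale to 0 .. levels
        codes.append(min(code, levels - 1))   # s == 1.0 falls in the top cell
    return codes

def dequantize(codes, bits=4):
    """Reconstruct each code at the centre of its quantization cell."""
    levels = 2 ** bits
    return [(c + 0.5) / levels * 2.0 - 1.0 for c in codes]

# Hypothetical source signal: 10 ms of a 440 Hz tone sampled at 4 kHz.
sampling_hz = 4000
tone = [math.sin(2 * math.pi * 440 * i / sampling_hz) for i in range(40)]
codes = quantize(tone, bits=4)
restored = dequantize(codes, bits=4)
```

With 4 bits the reconstruction error per sample is bounded by one sixteenth of full scale, which is coarse but consistent with the stated goal that only the meaning of the voice must survive.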
  • the embedding order of the digitized audio ID information is determined in addition to the information embedding position.
  • As the embedding order, as shown in FIG. 3, a horizontal zigzag (A), a vertical zigzag (B), a horizontal zigzag (C), an irregular random order (D), and the like can be selected.
  • For example, when (C) in FIG. 3 is adopted as the data order, the digital original image information is scanned sequentially in the direction of the arrow in the figure, and the data of the sequence constituting the digital voice ID information is embedded in data block units.
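The embedding orders named above can be sketched as functions that return the sequence of block coordinates to visit. FIG. 3 itself is not reproduced here, so reading "zigzag" as a back-and-forth (boustrophedon) scan, using a seeded shuffle for the random order, and the function names are all assumptions.

```python
import random

def horizontal_zigzag(width, height):
    """Scan row by row, reversing direction on every other row."""
    order = []
    for y in range(height):
        xs = range(width) if y % 2 == 0 else range(width - 1, -1, -1)
        order.extend((x, y) for x in xs)
    return order

def vertical_zigzag(width, height):
    """Scan column by column, reversing direction on every other column."""
    order = []
    for x in range(width):
        ys = range(height) if x % 2 == 0 else range(height - 1, -1, -1)
        order.extend((x, y) for y in ys)
    return order

def random_order(width, height, seed):
    """Pseudo-random visiting order; the seed would have to be shared
    with the extractor (an assumption, e.g. as part of the key data)."""
    order = [(x, y) for y in range(height) for x in range(width)]
    random.Random(seed).shuffle(order)
    return order

h = horizontal_zigzag(3, 2)
v = vertical_zigzag(2, 3)
r = random_order(3, 2, seed=7)
```

Because each function is deterministic for given arguments, the extraction side can regenerate the same order instead of storing every position.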
  • In the case of FIG. 1, the information obtained from the embedding process 2 itself becomes the new information 5 in which the voice ID information is embedded; in the case of FIG. 2, the information after the inverse conversion process 8 is the new information 9 in which the voice ID information is embedded.
  • the new information 5 and 9 also include the information of the digital watermark embedding position as key data.
  • the parameters of the voice ID information configuration such as the number of quantization bits and the data of the embedding order may be included together with the key data.
  • However, when this information can be obtained again by calculation at the time of reproduction, it need not be included in the new information 5 and 9. If such calculation is not possible, the embedding position and the parameters of the voice ID information configuration may be included in the information 5 and 9 as key data. If, however, an external certificate authority or the like stores the key data and makes it available, the key information need not be included in information 5 or 9 at all. In addition, when the voice configuration parameters, embedding position, and embedding order are fixedly determined according to a certain algorithm, such information need not be included in the key data.
  • FIG. 4 shows a detailed processing procedure for embedding the audio ID information by the method shown in FIG.
  • The processing shown in the figure can be executed by, for example, an information processing device such as a computer system having a computer main unit, a display, a keyboard, and an interface circuit capable of transmitting information to the outside.
  • digital audio ID information for a watermark to be embedded in the original image information is set (11).
  • The setting of the voice ID information performed here is the creation of the basic digital voice ID information. For example, audio information that indicates copyright ownership is generated as analog audio information, and this is converted into digital audio ID information using multiple types of audio parameters.
  • The voice ID information parameters are determined with reference to the original image information, and the corresponding digital voice ID information for embedding is determined (12). For example, the required voice configuration parameters are determined in consideration of how much error can be tolerated by the embedding, given the data density and color tone of the original image, and how much watermark strength is required.
  • The position and the order of embedding the digital audio ID information as the watermark information are determined from the original image information according to a predetermined algorithm, such as one using the above-mentioned masking effect (13).
  • The process of embedding the digital audio ID information for the watermark in the original image information is repeated to the end, changing the position according to the above order (14, 15).
  • When the method of Fig. 2 is to be realized, the original image information is frequency-converted before step 14, and the inverse of the frequency conversion is performed on the information in which the voice ID information has been embedded. These steps are the differences from the processing in Fig. 4.
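The embedding loop (steps 14 and 15) and the matching read-back can be sketched for the sample-value domain as follows. The text does not specify how a sample value is altered, so writing one bit into the least significant bit of each selected pixel is an assumption, as are the function names and the flat list of pixel indices used for the positions.

```python
def codes_to_bits(codes, bits_per_code=4):
    """Serialize quantized voice codes MSB-first into a flat bit list."""
    return [(c >> shift) & 1
            for c in codes
            for shift in range(bits_per_code - 1, -1, -1)]

def bits_to_codes(bits, bits_per_code=4):
    """Inverse of codes_to_bits."""
    codes = []
    for i in range(0, len(bits), bits_per_code):
        value = 0
        for b in bits[i:i + bits_per_code]:
            value = (value << 1) | b
        codes.append(value)
    return codes

def embed(pixels, positions, codes, bits_per_code=4):
    """Write one bit into the LSB of each pixel, visiting `positions` in order.
    `positions` plays the role of the embedding position and order (key data)."""
    bits = codes_to_bits(codes, bits_per_code)
    if len(bits) > len(positions):
        raise ValueError("not enough embedding positions")
    marked = list(pixels)
    for pos, bit in zip(positions, bits):
        marked[pos] = (marked[pos] & ~1) | bit
    return marked

def extract(marked, positions, count, bits_per_code=4):
    """Read the LSBs back in the same order and rebuild the voice codes."""
    bits = [marked[pos] & 1 for pos in positions[:count * bits_per_code]]
    return bits_to_codes(bits, bits_per_code)

pixels = list(range(100, 116))             # 16 hypothetical 8-bit samples
positions = [0, 3, 5, 6, 9, 10, 12, 15]    # hypothetical embedding positions
codes = [0b1010, 0b0111]                   # two 4-bit voice samples
marked = embed(pixels, positions, codes)
recovered = extract(marked, positions, len(codes))
```

Each pixel changes by at most one level, which keeps the mark invisible, but plain LSB embedding would not survive lossy compression; the robustness discussed in the text would require a stronger embedding rule.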
  • FIG. 5 shows a method of separating the digital audio ID information and the original image information from the image information in which the digital audio ID information is embedded by the method of FIG.
  • In the watermark extraction process 21, which separates the digital audio ID information and the original image information from the image information 5 in which the digital audio ID information is embedded, the positions from which the audio ID information is to be extracted are obtained from the information 5 according to the embedding position and embedding order used at the time of embedding, and the original image information is separated from the digital audio ID information.
  • The embedding position and embedding order of the digital voice ID information are, for example, encrypted at the time of watermark embedding and stored as key data, for example in information 5, so the extraction position and extraction order are obtained by taking out the key data and using it according to certain rules. As a result, the separated digital voice ID information is reconstructed in a reproducible state. The reconstructed digital voice ID information can be played back or recognized in accordance with the voice configuration parameters.
  • When the embedding position and the embedding order were determined using the original image information at the time of embedding, then even if information indicating the embedding position and order is not directly included in the image information 5, the embedding position and order may be calculated again from the original image information, and the voice ID information may be reconstructed in accordance with the calculated position and order.
  • Voice parameters such as the number of quantization bits of the voice a(i), the sampling period (frequency) Δt, and the number of voice data (time) n can be obtained from the restoration information that remains after removing the embedded information from the embedding positions, using the same algorithm as that used at the time of embedding. Alternatively, if a rule has been determined in advance, it is only necessary to follow that rule.
  • The information indicated by 22 is the separated digital audio ID information, and the information indicated by 23 is the restoration information of the separated original image information.
  • FIG. 6 shows a method for separating the digital audio ID information and the original image information from the image information in which the digital audio ID information is embedded by the method of Fig. 2.
  • First, a frequency conversion process 26 equivalent to the frequency conversion process 6, such as an FFT (Fast Fourier Transform), used in the embedding processing is performed on the information 9 in which the digital watermark is embedded.
  • A watermark extraction process 24 is then performed using the same algorithm as described above.
  • The restoration information 23 here is obtained by removing the embedded information at the embedding positions and then performing an inverse transform 28 of the frequency transform 26.
  • FIG. 7 shows in detail a processing procedure for separating the audio ID information and the original image information and further reproducing the same by the method shown in FIG.
  • The processing shown in the figure can be executed by an information processing device such as a computer system having a computer main unit, a display, a keyboard, and an interface circuit capable of transmitting information to the outside.
  • First, data indicating the positions where the voice ID data for watermarking is embedded is obtained from the key data in accordance with a predetermined watermark extraction processing algorithm (31).
  • Next, the restoration data and the voice ID data are provisionally separated (32). At this stage, the order of the digital voice ID information is not considered.
  • The voice parameters are determined from the separated restoration data (33). It is assumed that the order of the data of the digital voice ID information was determined in advance at the embedding stage. Using that same order, and referring to the number of quantization bits in the voice configuration parameters, the digital voice ID information is reconstructed from the data provisionally separated in step 32 (34).
  • When the digital voice ID information has been reconstructed and completely separated, noise removal is performed on the separated digital voice ID information (35).
  • Voice reproduction processing is performed on the separated digital voice ID information from which noise has been removed, and the meaning of the digital voice ID information is extracted or recognized (36). If the voice ID information is multiplexed and embedded, the process of steps 33 to 36 is performed for all voice ID information via step 37.
  • the frequency of the information 9 is converted before step 32.
  • the information corresponding to the original image information separated in step 32 is subjected to inverse conversion of frequency conversion to obtain restoration information. Their processing is different from that of FIG.
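The noise removal of step 35 could take many forms; the text does not name a filter. As one possibility, the sketch below applies a sliding median to the extracted voice codes, which suppresses isolated corrupted samples of the kind that lost or damaged watermark data would produce. The choice of a median filter, the window size, the edge handling, and the function name are all assumptions.

```python
from statistics import median

def median_filter(samples, window=3):
    """Replace each sample by the median of its neighbourhood.
    Edges are handled by repeating the first and last sample."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    samples = list(samples)
    if not samples:
        return []
    half = window // 2
    padded = [samples[0]] * half + samples + [samples[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(samples))]

# Hypothetical extracted 4-bit codes with two corrupted samples (15 and 0).
noisy = [7, 8, 15, 8, 9, 9, 0, 10, 10]
cleaned = median_filter(noisy)
```

A median filter also smooths genuine fast changes in the waveform, so for speech a band-limiting filter matched to the sampling frequency may be a better fit in practice.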
  • Fig. 8 shows a basic configuration example of an information processing system assuming that the above method is applied to businesses such as electronic approval, electronic commerce, and information distribution.
  • reference numerals 40 and 41 denote information processing devices, respectively.
  • The first information processing device 40 can output image information in which digital audio ID information carrying meaning as ID information is separably embedded, as invisible digital watermark information, in the frequency domain of the original digital image information.
  • The second information processing device 41 receives the image information output by the first information processing device 40 and uses it as electronic approval information or signature information. The second information processing device 41 then separates the digital audio ID information from the image information and reproduces it, thereby making it possible to check whether or not the image information is genuine.
  • the process in which the first information processing device 40 embeds digital audio ID information in image information is the same as the process in FIG.
  • The information 9 in which the digital audio ID information 3 is embedded as a digital watermark is subjected to an encoding (compression) process (42) standardized as MPEG (Moving Picture Experts Group), JPEG (Joint Photographic Experts Group), or the like, and the encoded data is encrypted (44) before being transmitted over the network.
  • The transmission path through which the encrypted data is transmitted is a source of disturbances that can remove or destroy the embedded watermark information.
  • the voice ID information embedded as watermark information needs to be strong enough to resist such disturbances.
  • The strength of the voice ID information is variable according to voice parameters such as the number of quantization bits and the sampling frequency, so the required strength can be easily set. Even if voice quality deteriorates due to missing data, it is sufficient that enough information remains to finally recognize the meaning of the voice. In the case of voice data, for example, the meaning can still be understood even if 50% of the data is missing, so voice information is excellent in terms of strength against disturbance.
  • Even with a voice time of 1 to 3 seconds, it is possible to reproduce 20 to 50 words and phrases, which is enough to prove the ID.
  • The watermark strength can be easily increased by changing the audio parameters, and the audio quality can easily be improved as well.
  • The encrypted data transmitted via the transmission path is subjected to decryption processing (47), and the second information processing device 41 further performs decoding (decompression) processing for the MPEG code (48).
  • The decoded information corresponds to the image information 9 in which the digital watermark is embedded.
  • When the image information is used for electronic approval or digital signature, whether or not the image information is genuine is determined at the level of the visible information and, further, at the level of the invisible digital watermark. For example, when it is necessary to make the determination at the level of the invisible digital watermark, the voice ID information is separated through the frequency conversion process 26 and the watermark extraction process 24, as described with reference to FIG. 6.
  • Noise may then be removed from the separated information by filtering (50), voice recognition may be performed (51), and the meaning of the voice of the ID may be recognized. If the audio ID information was encoded before embedding, the information 22 is decoded to obtain the audio ID information.
  • FIG. 9 shows the flow of processing by the information processing apparatus in FIG. 8 in detail.
  • FIG. 10 highlights the embedding location when audio ID information is multiplexed and embedded as a digital watermark in image information.
  • Area 61 is the voice ID information area of the copyright holder, area 62 is the voice ID information area of the owner of the original image information, and area 63 is the voice ID information area of the authorized duplication right holder of the original image information.
  • unique digital voice ID information is embedded one after another as digital watermark information.
  • 61D, 62D, and 63D denote the segment data of the voice ID information.
  • When embedding is multiplexed, the next digital watermark information is embedded around a new embedding center shifted from the existing one. That is, in each of the regions 61, 62, and 63, the position of the center of gravity of the region can be determined from the position data of the embedded audio ID information.
  • A new embedding position may then be determined so as to deviate from the existing center of gravity positions.
  • a place for storing the history or the latest position may be secured in the information 5 or 9.
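The center-of-gravity bookkeeping for multiplexed embedding can be sketched as below. Computing the centroid from the stored position data follows the text; the rule for choosing the next center (taking the candidate farthest from every existing centroid) and the function names are assumptions, since the text only requires that the new position deviate from the existing ones.

```python
def centroid(positions):
    """Center of gravity of a list of (x, y) embedding positions."""
    if not positions:
        raise ValueError("no positions given")
    n = len(positions)
    return (sum(x for x, _ in positions) / n,
            sum(y for _, y in positions) / n)

def pick_new_center(existing_centroids, candidates):
    """Assumed rule: choose the candidate whose nearest existing centroid
    is as far away as possible."""
    def nearest_sq_distance(c):
        return min((c[0] - e[0]) ** 2 + (c[1] - e[1]) ** 2
                   for e in existing_centroids)
    return max(candidates, key=nearest_sq_distance)

# Hypothetical position data for an already embedded region (such as area 61).
area_61 = [(0, 0), (2, 0), (2, 2), (0, 2)]
c61 = centroid(area_61)
next_center = pick_new_center([c61], [(1, 2), (8, 8), (4, 1)])
```

Storing only the latest centroids, rather than every position, would be one way to use the history area mentioned above.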
  • Fig. 11 shows an example of embedding a digital watermark with audio ID information in electronic approval documents and electronic identification documents.
  • In the electronic approval document, the seal impression image information of the digital approval marks 71 and 72, which are liable to be illegally copied, is used as the original image information, in which audio ID information is embedded as digital watermark data.
  • In the electronic identification document, the digitized face photo 84, which is liable to be illegally copied, is used as the original image information, in which audio ID information is embedded as digital watermark data.
  • Figure 12 illustrates an overview of an electronic commerce or electronic approval system using voice ID information.
  • The copyright owner 82 creates voice ID information 83 that only the owner can know.
  • In the product sales department (or electronic approval creation department) 84, the voice ID information 83 is embedded in the copyright information A 85 produced by the copyright holder 82, by the embedding method shown in FIG. 1 or FIG. 2, using a program 86 for embedding the voice ID information 83. New copyright information B 87 with the voice ID information embedded is thereby obtained.
  • This copyright information B 87 is used as an approved item or as information distributed as a product. When it is necessary to check whether this copyright information B 87 is genuine or illegally used, the checking organization 90 makes the determination.
  • The copyrighted work B 88 to be checked is passed through the detection program 91, which extracts the digital voice ID information by the voice ID information detection and separation method shown in FIG. 5 or FIG. 6. The voice ID information 92 is detected and compared with the voice ID information registered in advance in the database 96 of the management unit 93, and pass or fail is determined.
  • Here the copyright holder 82 creates the voice ID information 83, but it is not always necessary for the copyright holder 82 to create it; the copyright manager 95 may create it instead.
  • Although the management department 93, the production department 81, and the check department 90 are configured separately here, they need not necessarily be separate. Any two of the departments may be integrated.
  • A digital watermark serves the following purpose. Encrypted information is meaningless if merely copied without authorization, but if it is illegitimately decrypted and the restored information is illegally copied, the digital watermark information can be extracted, for example by comparison with original information possessed only by the copyright holder, in order to assert copyright or to identify the unauthorized copier. Therefore, the audio ID watermark information extraction processing shown in FIGS. 8 and 9 is often used only when investigating fraud. The extraction process is performed by, for example, a watermark management company.
  • the product sales department (or electronic approval creation department) 84 can be regarded as a first information processing device, and the check organization 90 can be regarded as a second information processing device.
  • The first information processing device 84 executes the audio ID embedding program 86 to generate and output the copyright information B 87, in which the digital audio ID information is incorporated into the copyright information A 85.
  • The second information processing device 90 receives the copyright information B 87, extracts the digital voice ID information 92 from the information 87 as necessary, and can compare it with the ID information of the genuine copyright holder registered in the database in advance.
  • FIG. 13 shows an example of an information processing device (also referred to as a computer device) that uses the audio ID information for digital watermarking, and an example of an information processing network including the device.
  • The information processing network shown in Fig. 13 is a system such as a LAN (local area network), a WAN (wide area network) such as the Internet, or a wireless communication network, and reference numeral 104 denotes a transmission medium in that system, such as an optical fiber, an ISDN line, or a wireless line.
  • Although not particularly limited, a host computer device 103 and the representatively shown terminal computer devices 100, 101, and 102 are connected to the transmission medium 104 via communication adapters 105, 106, and 107 such as routers and terminal adapters.
  • Although not particularly limited, the terminal computer device 100 has a display controller (DISPC) 113, a network controller (NETC) 114, and a DRAM 115 connected via an external bus 108 to a data processor (MPU) 109 formed as a semiconductor integrated circuit, and has a floppy disk controller (FDC) 110, a keyboard controller (KEYC) 111, and an integrated device electronics controller (IDEC) 112 connected to a peripheral circuit (not shown) built into the data processor 109.
  • the DISPC 113 controls drawing on the video RAM (VRAM) 121, and displays the drawn display data on the display (DISP) 120.
  • The NETC 114 is connected to the communication adapter 105, and performs buffering of transmitted and received information and communication protocol control.
  • The DRAM 115 is used as the program area and work area of the data processor 109.
  • the FDC 110 is connected to a floppy disk drive device 116 to read information from and write information to a floppy disk 130 which is an example of a recording medium.
  • a keyboard 117 is connected to the KEYC 111.
  • A hard disk drive (HDD) 118 and a CD-ROM drive (CDRD) 119 are connected to the IDEC 112.
  • The HDD 118 has a magnetic disk, which is another example of the recording medium.
  • The CDRD 119 has a CD-ROM 131, which is another example of the recording medium.
  • The other terminal computer devices 101 and 102 have the same configuration as above.
  • For example, when the digital audio ID information embedding process of FIG. 1 or FIG. 2, or the digital voice ID information extraction processing described above, is performed using the terminal computer device 100, a program for that purpose is installed from the floppy disk 130 or the CD-ROM 131 into the hard disk drive 118, for example by a user.
  • the program is recorded in the floppy disk 130 and the CD-ROM 131 in advance.
  • Alternatively, the vendor of the terminal computer 100 may provide the program preinstalled on the hard disk drive.
  • When executing the installed program, the data processor 109 loads the program into the DRAM 115, and sequentially fetches and executes instructions from the DRAM 115. A part of the program stored in the CD-ROM 131 can also be taken directly from the CD-ROM and executed.
  • The terminal computer device 100 can install the program via the floppy disk 130 or the like, or can execute the program directly from the hard disk drive device 118 or the like. Therefore, with the recording media 130 and 131 storing the program, the terminal computer 100, as an example of the information processing device, can easily realize embedding of ID information by digital watermarking and separation of the embedded ID information, with a small amount of embedded information relative to the original image information and with strength against both disturbance to the embedded information and information loss.
  • The terminal computer device 100 can also download the program from the host computer device 103.
  • The host computer device 103 holds, for example, the program in compressed form in a hard disk device or the like.
  • The terminal computer 100 designates the program and instructs the download, whereby the program is sent out onto the transmission medium 104 and transmitted to the hard disk drive 118 of the terminal computer device 100.
  • The downloaded program is then decompressed and installed in a predetermined program storage area. In this manner, the terminal computer 100 can easily acquire the program for embedding or separating the digital audio ID information via the transmission medium 104 on a network.
  • The transmission medium 104 is thus useful in allowing the terminal computer 100 to easily realize embedding of ID information by digital watermarking and separation of the embedded ID information, with a small amount of embedded information relative to the original image information and with strength against both disturbance to the embedded information and information loss, as described above.
  • The recording media such as the floppy disk 130 and the CD-ROM 131 are also media for recording and distributing image information in which the digital audio ID information is embedded in the original image information as digital watermark information.
  • The transmission medium 104 is also a medium for transmitting image information, in which the digital audio ID information is embedded in the original image information as digital watermark information, to an information processing apparatus such as the information processing apparatus 100.
  • The information processing apparatus 100 or the like that receives the image information recorded on the recording media 130 and 131, or the image information transmitted via the transmission medium 104, can separate the digital voice ID information serving as digital watermark information and confirm its content as necessary. This processing can be realized relatively easily even in a low-cost information processing device.
  • Because digital voice ID information is adopted as the digital watermark information carrying meaning as ID information, its meaning can be recognized even with an information loss of, for example, 50%, and watermark strength robust against disturbance can be easily realized.
  • the quality of the audio reproduced and extracted may be the lowest quality that can be understood.
  • the number of quantization bits is 4 bits and the sampling frequency is 4 kHz
  • a watermark data amount of 16 kbit/s (uncompressed) is sufficient.
  • a reproduction time of 1 to 3 seconds is enough for use as ID information.
  • digital audio ID information is used as the digital watermark information having meaning as ID information, so even if the extracted audio ID watermark contains noise, it is enough that its meaning can be recognized; even when the meaning is somewhat hard to recognize, recognition can readily be improved by filtering out the noise. The process of extracting the watermark information is therefore simple, and the information processing device that performs it can be low cost.
  • adding automatic speech recognition also makes automatic recognition of the ID information possible.
  • unlike image ID information, digital audio ID information does not have to be embedded along a position pattern of the original image, so it is easy to embed several pieces of ID information in the same original image information. Even when complicated rights are attached to digital image information, the ID information of each copyright holder or other rights holder can be embedded easily and separably.
  • the frequency transform is not limited to the above, and may be a wavelet transform.
  • the basic embedding algorithm of the watermark information is not limited to the above, and the watermark information may be embedded by dividing the original image information into a number of blocks.
  • the present invention can be widely applied to various electronic approval, electronic commerce, and electronic distribution systems, or various information processing devices such as personal computers and mobile communication terminals used for the systems.
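The figures quoted in the list above (4 quantization bits, a 4 kHz sampling frequency, and 1 to 3 seconds of audio) can be checked with a short calculation. The sketch below is only a worked example: the function name `watermark_bits` is hypothetical, and the comparison with a 256 x 256 image at one embedded bit per pixel is an assumption made for illustration, not a figure from the text.

```python
def watermark_bits(quant_bits: int, sampling_hz: int, seconds: float) -> int:
    """Uncompressed size in bits of a single-channel audio ID clip."""
    return int(quant_bits * sampling_hz * seconds)


# 4 bits x 4 kHz gives the 16 kbit/s data amount quoted in the text.
bits_per_second = watermark_bits(4, 4000, 1)
print(bits_per_second)

# Payload for the 1 to 3 second clips suggested for ID use.
for seconds in (1, 2, 3):
    print(seconds, watermark_bits(4, 4000, seconds))

# Assumption for illustration: one embedded bit per pixel of a 256 x 256 image.
capacity_bits = 256 * 256
print(watermark_bits(4, 4000, 3) <= capacity_bits)
```

Under that one-bit-per-pixel assumption, even the 3 second clip occupies fewer bits than a 256 x 256 image offers, which is consistent with the claim that the embedded amount is small relative to the original image.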

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Editing Of Facsimile Originals (AREA)

Abstract

Digital audio ID information (3) having meaning as ID information is embedded as invisible digital watermark information in the sample value domain of digital original image information (1), or in the frequency domain obtained by frequency-transforming the digital original image information, so that an information processing device (41) can separate the embedded digital audio ID information from the sample value domain or the frequency domain. This provides a function of deterring illegal copying of digital image information. Because digital audio ID information is used as the digital watermark information carrying the ID, its meaning can be recognized even if, for example, 50% of the information is lost, and the watermark is highly robust against disturbance. The robustness against disturbance can be selected simply and as required by changing audio parameters such as the number of quantization bits and the sampling frequency. The ID information carried by the digital watermark can be recognized relatively easily even by a low-cost system.

Description

Information processing method and recording medium

Technical Field
The present invention relates to a technique for embedding a digital watermark in original image information in order to make clear the copyright or other rights in original image information created by a digital information content provider or the like. It relates in particular to a technique that is effective when applied to an information processing method and an information processing apparatus for embedding a digital watermark, and to an information processing system for electronic approval or electronic transactions that uses information in which a digital watermark is embedded.

Background Art
An example of a document describing digital watermarking is Nikkei Electronics, February 24, 1997 (no. 683), pages 99 to 124, published by Nikkei BP. Digital watermarking is a technique for embedding and hiding some information in multimedia data such as images and audio. If invisible digital watermark information remains embedded somewhere in an electronic work, it can be expected to deter unauthorized copying of that work.
Conventionally, when a digital watermark was embedded in images, audio, text, or the like, an encoded ID (identification) number corresponding to each individual copyright holder, or to each company holding distribution rights to the original information, was encrypted and embedded in the original information. Alternatively, an encoded ID number corresponding one-to-one to the information itself, such as image data, audio data, or text, was encrypted and embedded in the original information. In either case, the embedded ID information itself took the data format of image or text information. Key data is used to detect the ID number; the key data carries information that identifies where the ID number was embedded.
As a technique that applies digital watermarking from a viewpoint other than deterring unauthorized copying, there is the video apparatus described in Japanese Patent Application Laid-Open No. H10-290424. That video apparatus embeds digital signals of subtitles and audio in multiple languages into a digitized video signal by digital watermarking. This makes it possible to record text and audio information in multiple languages without changing the recording format, and at playback any of the subtitle or audio digital signals extracted from the video signal can be selected and reproduced. The purpose of digital watermarking there is to embed subtitles or audio information related to the video information efficiently. The watermark must therefore preserve the quality of the reproduced information: the embedded subtitles and audio must be clearly restorable, and the voice of each performer and each sound must be distinguishable. In this sense it is entirely different from digital watermarking used as an ID for protecting copyright and the like.
Japanese Patent Application Laid-Open No. H11-168616 describes an image information processing technique that uses information such as images, audio, and text as embedded information. The embedded information is superimposed on the original information as visible information, and the aim of the technique is to embed information that accompanies and relates to the original information. The embedded information thus becomes one more piece of multimedia information added to the original information, and it is information that should be actively exposed to the user. In this respect the technique is entirely different from one that embeds the ID information of the original information for purposes such as copyright protection. For example, when the technique is applied to an electronic picture book and a photograph of a zoo is used as the image to be embedded into, a lion's roar may be embedded in the picture of a lion and information about the ecology of penguins in the picture of a penguin. Associating the host image information with embedded information that is easy to connect with it makes it easier for users to find the embedded information they want among multiple pieces of embedded information.
The present inventors' study of digital watermarking used as an ID clarified the following further points. Conventionally, the digital watermark information used as an ID has been an image, text, or an encrypted form of either. Image information in which such watermark information is embedded is exposed to various disturbances on the way from sender to recipient, for example in network business: noise added during transmission, loss of information during transmission, encryption, compression coding, decryption, and decompression. Digital watermark information has therefore been required to be robust against these disturbances, because the purpose is copyright protection and the extracted watermark must, in the end, allow the meaning of the ID information to be confirmed accurately. There is nevertheless a natural limit to how far the strength of a digital watermark can be raised: the watermark is added to the original information, so embedding should affect the original information as little as possible. If the watermark processing becomes complex and heavy in order to raise the strength of the watermark while keeping its effect on the original information to a minimum, the information processing system becomes correspondingly expensive, and incorporation into personal information devices and the like becomes difficult.
An object of the present invention is to provide an information processing method that realizes a digital watermark serving as ID information which, compared with conventional invisible digital embedding techniques, requires a small amount of embedded information relative to the original image information and is robust both against disturbance to the original information in which it is embedded and against information loss, together with a recording medium and a transmission medium for a program that contributes to realizing the method, and an information processing apparatus and an information processing system.

Another object of the present invention is to provide an information processing method that allows ID information carried by a digital watermark to be recognized relatively easily even in a low-cost system, together with a recording medium and a transmission medium for a program that contributes to realizing the method, and an information processing apparatus and an information processing system.

Still another object of the present invention is to provide a recording medium storing a program that allows an information processing apparatus to realize, simply, the embedding of ID information as a digital watermark that requires a small amount of embedded information relative to the original image information and is robust against disturbance to the original information and against information loss, and the reproduction of the embedded ID information.

The above and other objects and novel features of the present invention will become apparent from the following description and the accompanying drawings.

Disclosure of the Invention
[1] From the viewpoint of embedding digital watermark information, the information processing method is characterized by separably embedding digital audio ID information, which has meaning as ID information, as invisible digital watermark information in digital original image information, that is, in the domain of the sample values of the digital original image information (the sample value domain). In a more detailed aspect, the method includes: an embedding position determination process that determines, by referring to the digital original image information, a plurality of embedding positions for embedding digital watermark information in the digital original image information; an order determination process that determines the order of embedding for the determined embedding positions; and a process that embeds the digital audio ID information at the embedding positions as invisible digital watermark information in accordance with that order.
With the above means, the watermark strength of the digital audio ID information embedded as invisible digital watermark information can generally be increased by using more quantization bits, a shorter sampling period, and more audio data. A characteristic of an audio ID is that, when the watermark is finally extracted, it is enough for the meaning of the audio to be recognized (understood) by a person (a checker) or by a machine; poor audio quality (sound quality, noise level, identifiability of the speaker's gender or individuality, and so on) is acceptable. At reproduction, the meaning can be recognized even if several tens of percent of the digital audio ID information has been lost. It is therefore possible to realize a digital watermark serving as ID information that requires a small amount of embedded information relative to the original image information and is robust against disturbance to the original image information in which it is embedded and against information loss. In addition, because audio parameters such as the number of quantization bits and the sampling frequency are easy to change, the robustness against disturbance can easily be chosen as desired, according to the amount of original image information or as otherwise required. As a result, recognition of ID information by digital watermarking can be realized relatively easily even in a low-cost system.
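The sample-value-domain steps described above (position determination, order determination, embedding, and later separation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the specification: the text determines positions by referring to the original image, whereas this sketch substitutes positions drawn from a shared numeric key, and it assumes the bits are written into the least significant bit of each selected sample. All function names are hypothetical.

```python
import random


def embedding_positions(num_samples: int, num_bits: int, key: int) -> list[int]:
    """Pick distinct embedding positions; the list order is the embedding order.

    Assumption: positions come from a shared key, not from image analysis.
    """
    return random.Random(key).sample(range(num_samples), num_bits)


def audio_to_bits(samples: list[int], quant_bits: int) -> list[int]:
    """Serialize audio samples to bits, most significant bit first."""
    return [(s >> shift) & 1
            for s in samples
            for shift in range(quant_bits - 1, -1, -1)]


def bits_to_audio(bits: list[int], quant_bits: int) -> list[int]:
    """Rebuild audio samples from a bit sequence produced by audio_to_bits."""
    samples = []
    for i in range(0, len(bits), quant_bits):
        value = 0
        for bit in bits[i:i + quant_bits]:
            value = (value << 1) | bit
        samples.append(value)
    return samples


def embed(pixels: list[int], audio: list[int], quant_bits: int, key: int) -> list[int]:
    """Write each audio bit into the LSB of one selected pixel (assumed scheme)."""
    bits = audio_to_bits(audio, quant_bits)
    positions = embedding_positions(len(pixels), len(bits), key)
    marked = list(pixels)
    for pos, bit in zip(positions, bits):
        marked[pos] = (marked[pos] & ~1) | bit
    return marked


def separate(marked: list[int], num_audio_samples: int, quant_bits: int, key: int) -> list[int]:
    """Read the bits back in embedding order and rebuild the audio samples."""
    positions = embedding_positions(len(marked), num_audio_samples * quant_bits, key)
    return bits_to_audio([marked[pos] & 1 for pos in positions], quant_bits)


pixels = [(i * 37) % 256 for i in range(64)]   # stand-in for image sample values
audio = [3, 12, 7, 0]                          # four 4-bit audio samples
marked = embed(pixels, audio, quant_bits=4, key=1999)
print(separate(marked, len(audio), quant_bits=4, key=1999) == audio)
```

Each pixel changes by at most one level, which is why the embedded audio stays invisible in this sketch; a plain LSB scheme is fragile, though, so the robustness discussed in the text would need a stronger embedding rule.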
An information processing method that focuses on the domain of frequency-transformed values of the digital original image information (the frequency domain) is characterized by including: a transform process that frequency-transforms the digital original image information; an embedding process that embeds digital audio ID information, which has meaning as ID information, as invisible digital watermark information in predetermined frequency components obtained by the transform process; and an inverse transform process that applies the inverse of the frequency transform to the information that has undergone the embedding process. In a more detailed aspect, the method includes: a transform process that frequency-transforms the digital original image information; an embedding position determination process that determines, from among the frequency components obtained by the transform process, a plurality of frequency components as embedding positions for the digital watermark information; an order determination process that determines the order of embedding for the determined embedding positions; a process that embeds the digital audio ID information at the embedding positions as invisible digital watermark information in accordance with that order; and an inverse transform process that applies the inverse of the frequency transform to the information that has undergone the embedding process.
Embedding the audio ID information in the frequency domain requires slightly more processing than embedding it in the sample value domain, but makes it easy to increase the watermark strength.
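The transform, embed, and inverse-transform sequence for the frequency domain can be illustrated on a single block of samples. The sketch below rests on assumptions that this passage does not state: it uses a one-dimensional orthonormal DCT as the frequency transform and quantization index modulation (forcing the parity of a quantized coefficient) as the embedding rule. The coefficient index, the step size, and all names are hypothetical.

```python
import math


def dct(block: list[float]) -> list[float]:
    """Orthonormal DCT-II of a one-dimensional block."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(block)))
    return out


def idct(coeffs: list[float]) -> list[float]:
    """Inverse of dct (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out


def embed_bit(block: list[int], bit: int, index: int = 3, step: float = 16.0) -> list[int]:
    """Transform, force the parity of one quantized coefficient, transform back."""
    coeffs = dct(block)
    q = round(coeffs[index] / step)
    if q % 2 != bit:
        q += 1
    coeffs[index] = q * step
    # Rounding to integer samples; clipping to a pixel range is omitted here.
    return [round(v) for v in idct(coeffs)]


def extract_bit(block: list[int], index: int = 3, step: float = 16.0) -> int:
    """Transform again and read the parity of the quantized coefficient."""
    return round(dct(block)[index] / step) % 2


block = [52, 55, 61, 66, 70, 61, 64, 73]   # stand-in for one row of image samples
for bit in (0, 1):
    print(bit, extract_bit(embed_bit(block, bit)))
```

A larger `step` tolerates more disturbance to the coefficient at the cost of a larger change to the samples, which is one way to read the trade-off between watermark strength and effect on the original image.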
The method may further include a parameter calculation process that calculates, on the basis of the digital original image information, digital audio configuration parameters that define the number of quantization bits, the sampling frequency, and the audio duration of the digital audio ID information. The digital audio configuration parameters, which are also needed at reproduction, can then be obtained directly from the original image information. Alternatively, several sets of digital audio configuration parameters may be prepared in advance, for example in a table, and a set selected according to the required watermark strength.
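One possible reading of the parameter selection described above is a lookup over a prepared table. The sketch below is hypothetical throughout: the preset values, the ordering from strongest to weakest, and the capacity model of one embedded bit per pixel are all assumptions chosen for illustration, and the text does not specify how the parameters are derived from the image.

```python
from typing import NamedTuple, Optional


class AudioParams(NamedTuple):
    """Digital audio configuration parameters for one audio ID clip."""
    quant_bits: int
    sampling_hz: int
    seconds: float

    @property
    def total_bits(self) -> int:
        return int(self.quant_bits * self.sampling_hz * self.seconds)


# Hypothetical presets, ordered from strongest watermark to weakest.
PRESETS = [
    AudioParams(8, 8000, 3.0),
    AudioParams(4, 8000, 2.0),
    AudioParams(4, 4000, 1.0),
]


def choose_params(width: int, height: int, bits_per_pixel: int = 1) -> Optional[AudioParams]:
    """Return the strongest preset that fits the assumed image capacity."""
    capacity = width * height * bits_per_pixel
    for params in PRESETS:
        if params.total_bits <= capacity:
            return params
    return None


print(choose_params(512, 512))
print(choose_params(256, 256))
print(choose_params(100, 100))
```

Because the choice depends only on the image dimensions in this sketch, a reproducing device could recompute the same parameters from the received image without any side channel.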
Even relatively low-precision audio information is sufficient for ID recognition, for example digital audio ID information specified by audio parameters consisting of a number of quantization bits chosen from 3 to 16 bits and a sampling frequency chosen from the range of 3 to 40 kHz. Likewise, the digital audio ID information may be audio information quantized by a 1-bit oversampling method and specified by an audio parameter of a sampling frequency of 3 to 40 kHz.

Unlike an image ID, digital audio ID information is not restricted to being embedded along a position pattern of the original image, so it is easy to embed it multiple times in the original image information. For image ID information, the embedding positions are determined by the position pattern of the image, for example by embedding the ID as noise in so-called redundant portions that are not important to human perception. Digital audio ID information has a small amount of information and is inherently invisible, so multiple embedding is easy. If the embedding positions for the audio ID information being embedded are chosen so that their centroid differs from the centroid of the embedding positions already in use, reproduction of the ID information becomes easier.
[2] From the viewpoint of reproducing digital watermark information, when the digital audio ID information is embedded in the sample value domain of the digital original image information, the information processing method is characterized by including: a separation process that separates the digital audio ID information from information in which digital audio ID information having meaning as ID information has been separably embedded in digital original image information as invisible digital watermark information; and an audio reproduction process that reproduces the separated digital audio ID information. Audio reproduction is performed on the separated and reconstructed digital audio ID information using the audio configuration parameters, and the data order at reproduction follows the order in which the digital audio ID information was embedded.
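The reproduction step, in which the separated samples are played back using the audio configuration parameters, might look like the sketch below. It assumes the samples have already been separated and put back into embedding order, and it wraps them as 8-bit mono PCM with Python's standard `wave` module so that any ordinary player can reproduce them. The function name and the rescaling to 8 bits are illustrative choices, not part of the specification.

```python
import io
import wave


def to_wav_bytes(samples: list[int], quant_bits: int, sampling_hz: int) -> bytes:
    """Rescale low-resolution samples to 8-bit unsigned PCM and wrap them as WAV."""
    max_value = (1 << quant_bits) - 1
    pcm = bytes(round(s * 255 / max_value) for s in samples)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)             # single-channel audio ID
        wav.setsampwidth(1)             # 1 byte per sample (8-bit unsigned)
        wav.setframerate(sampling_hz)   # taken from the audio configuration parameters
        wav.writeframes(pcm)
    return buffer.getvalue()


# Four separated 4-bit samples reproduced at a 4 kHz sampling frequency.
wav_data = to_wav_bytes([0, 15, 8, 3], quant_bits=4, sampling_hz=4000)
with wave.open(io.BytesIO(wav_data), "rb") as wav:
    print(wav.getframerate(), wav.getnframes())
```

The quantization bit count and sampling frequency used here must match those used at embedding, which is why the text treats the audio configuration parameters as information needed at reproduction.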
When the digital audio ID information is embedded in the frequency domain obtained by frequency-transforming the digital original image information, the information processing method for reproduction includes: a frequency transform process that frequency-transforms information in which digital audio ID information having meaning as ID information is embedded, as invisible digital watermark information, in predetermined frequency components of the digital original image information; a separation process that separates the digital audio ID information from the information obtained by the frequency transform process; and an audio reproduction process that reproduces the separated digital audio ID information.

Because it is enough to recognize the digital audio ID information as digital watermark information, recognition of ID information by digital watermarking can be realized relatively easily even in a low-cost system.

[3] An invention from yet another viewpoint provides an information processing apparatus such as a computer with a program for realizing the information processing method.
The recording medium for the program is a medium such as a CD-ROM that records the program statically. The transmission medium is a communication medium that transmits the program dynamically in order to distribute or circulate it electronically, electromagnetically, or optically over a network connected by wired or wireless lines.

When ID information is embedded in the sample value domain, the recording medium for the program is one on which is recorded, readably by an information processing apparatus, a program for causing the information processing apparatus to execute: a process of separably embedding digital audio ID information, which has meaning as ID information, in digital original image information as invisible digital watermark information; and a process of separating the digital audio ID information from the information so embedded.
When embedding of ID information in the frequency domain is considered, the recording medium is one on which is recorded, readably by an information processing apparatus, a program for causing the information processing apparatus to execute: a first transform process that frequency-transforms the digital original image information; an embedding process that embeds digital audio ID information, which has meaning as ID information, as invisible digital watermark information in predetermined frequency components obtained by the first transform process; an inverse transform process that applies the inverse of the frequency transform to the information that has undergone the embedding process; a second transform process that transforms the inverse-transformed information into frequency components; and a separation process that separates the digital audio ID information from the information obtained by the second transform process.

When ID information is embedded in the sample value domain, the transmission medium for the program transmits a program for causing an information processing apparatus to realize: a function of separably embedding digital audio ID information, which has meaning as ID information, in digital original image information as invisible digital watermark information; and a function of separating the digital audio ID information from the information so embedded.

When embedding of ID information in the frequency domain is considered, the transmission medium transmits a program for causing an information processing apparatus to realize: a first transform function that frequency-transforms the digital original image information; an embedding function that embeds digital audio ID information, which has meaning as ID information, as invisible digital watermark information in predetermined frequency components obtained by the first transform function; an inverse transform function that applies the inverse of the frequency transform to the information obtained by the embedding function; a second transform function that transforms the inverse-transformed information into frequency components; and a separation function that separates the digital audio ID information from the information obtained by the second transform function.
The program may include a description of a processing procedure for controlling or realizing the process of reproducing the separated digital audio ID information.

With the recording medium and the transmission medium for the program, an information processing apparatus can be made to realize, simply, the embedding of ID information as a digital watermark that requires a small amount of embedded information relative to the original image information and is robust against disturbance to the original information in which it is embedded and against information loss, and the separation of the embedded ID information.
〔4〕本発明の別の観点による記録媒体は、原画像情報にディジ夕ル音 声 I D情報を電子透かし情報として埋め込んだ情報を記録している。即 ち、 この観点による記録媒体は、 画像情報を情報処理装置に読取り可能 に記録した記録媒体であって、 前記画像情報は、 I Dとして意味を持つ たディジ夕ル音声 I D情報が不可視電子透かし情報としてディジ夕ル 原画像情報の標本値領域に、又は前記ディジ夕ル原画像情報の周波数変 換された周波数領域に埋め込まれ、且つ埋め込まれた前記ディジタル音 声 I D情報が情報処理装置によって前記標本値領域又は周波数領域か ら分離可能にされた情報である。 詳しくは、 前記画像情報は、 ディジ夕 ル原画像情報に、 I D情報としての意味を持ったディジ夕ル音声 I D情 報が不可視電子透かし情報として分離可能に埋め込まれた情報である。 或いはまた、 前記画像情報は、 ディジ夕ル原画像情報の所定周波数成分 に I D情報としての意味を持ったディジタル音声 I D情報が不可視電 子透かし情報として埋め込まれ、周波数変換を介して前記ディジ夕ル音 声 I D情報が分離可能にされた情報である。 [4] A recording medium according to another aspect of the present invention records information in which digital audio ID information is embedded as digital watermark information in original image information. In other words, a recording medium according to this aspect is a recording medium in which image information is recorded in an information processing apparatus in a readable manner, and the image information is a digital audio ID information having a meaning as an ID, and invisible digital watermark information. The digital sound embedded and embedded in the sample value area of the original digital image information or in the frequency-converted frequency area of the original digital image information as Voice ID information is information that can be separated from the sample value area or the frequency domain by the information processing device. More specifically, the image information is information in which digital audio ID information having meaning as ID information is separably embedded as invisible digital watermark information in original digital image information. Alternatively, in the image information, digital audio ID information having a meaning as ID information is embedded as invisible electronic watermark information in a predetermined frequency component of the original digital image information, and the digital file is transmitted through frequency conversion. The voice ID information is separable information.
A transmission medium according to another aspect of the present invention transmits, to an information processing apparatus, the image information in which the digital audio ID information is separably embedded in the digital original image information.
For an information processing apparatus such as a computer that receives the image information recorded on the recording medium or transmitted over the transmission medium, this makes it relatively easy, even on a low-cost apparatus, to separate the audio ID information embedded as digital watermark information and to check its content when necessary.
[5] From the viewpoint of an information processing system that processes or uses information in which audio ID information is embedded in original image information, the present invention has in mind information processing systems for electronic approval, electronic commerce, information distribution, and the like.
When electronic approval in particular is considered, the information processing system comprises: a first information processing apparatus that outputs image information in which digital audio ID information having meaning as an ID is separably embedded, as invisible digital watermark information, in the sample value domain of digital original image information or in the frequency domain obtained by frequency-transforming the digital original image information; and a second information processing apparatus that receives the image information and uses the received image information as electronic approval information.
When the electronic signatures used in electronic approval and electronic commerce are considered, the information processing system comprises: a first information processing apparatus that outputs image information in which digital audio ID information having meaning as an ID is separably embedded, as invisible digital watermark information, in the sample value domain of digital original image information or in the frequency domain obtained by frequency-transforming the digital original image information; and a second information processing apparatus that receives the image information and uses the received image information as electronic signature information.
When information distribution in particular is considered, the information processing system comprises an information processing apparatus that stores image information in which digital audio ID information having meaning as an ID is separably embedded, as invisible digital watermark information, in the sample value domain of digital original image information or in the frequency domain obtained by frequency-transforming the digital original image information, and that outputs the stored image information in response to a distribution request.
According to the above information processing systems, information in which audio ID information is embedded in original image information is processed or used in electronic approval, electronic commerce, information distribution, and the like, so that a deterrent against unauthorized copying of the original image information can be expected at low system cost.
[6] From the viewpoint of the information processing apparatus used in the above information processing methods and information processing systems, the information processing apparatus includes input means and arithmetic control means, wherein the input means receives digital original image information, and the arithmetic control means is capable of separably embedding digital audio ID information having meaning as ID information in the received digital original image information as invisible digital watermark information.
An information processing apparatus according to another aspect includes storage means and arithmetic control means, wherein the storage means stores digital original image information, and the arithmetic control means is capable of applying a frequency transform to the digital original image information stored in the storage means, embedding digital audio ID information having meaning as ID information as invisible digital watermark information in predetermined frequency components obtained by the frequency transform, and applying the inverse of the frequency transform to the information resulting from the embedding.
An information processing apparatus according to a more detailed aspect includes storage means and arithmetic control means, wherein the storage means stores digital original image information, and the arithmetic control means is capable of referring to the digital original image information stored in the storage means to determine a plurality of embedding positions at which digital watermark information is to be embedded in the digital original image information, determining the order of embedding for the determined embedding positions, and embedding digital audio ID information having meaning as ID information as invisible digital watermark information at the embedding positions in accordance with that order.
A more detailed form of the information processing apparatus according to the other aspect includes storage means and arithmetic control means, wherein the storage means stores digital original image information, and the arithmetic control means is capable of frequency-transforming the digital original image information stored in the storage means, determining as embedding positions a plurality of frequency components, from among the frequency components obtained by the frequency transform, in which digital watermark information is to be embedded, determining the order of embedding for the determined embedding positions, embedding digital audio ID information having meaning as ID information as invisible digital watermark information at the embedding positions in accordance with that order, and applying the inverse of the frequency transform to the information resulting from the embedding process.
In this case, from the viewpoint of embedding digital watermark information multiple times, if digital audio ID information has already been embedded as invisible digital watermark information, the arithmetic control means, when determining the embedding positions, preferably determines the embedding positions of the current audio ID information so that their center of gravity differs from the center of gravity of the plurality of embedding positions already used.
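The center-of-gravity rule for multiple embedding can be sketched as follows. This is only an illustration under stated assumptions: embedding positions are taken to be (row, column) tuples, and the names `centroid`, `pick_positions`, and the `min_distance` threshold are hypothetical, since the text only requires that the two centers of gravity differ.

```python
from math import hypot

def centroid(positions):
    """Return the center of gravity (row, col) of a list of embedding positions."""
    n = len(positions)
    return (sum(r for r, _ in positions) / n, sum(c for _, c in positions) / n)

def pick_positions(candidate_sets, existing_positions, min_distance=1.0):
    """Pick the first candidate set whose centroid differs from the existing one.

    min_distance is an assumed threshold, not a value taken from the text.
    """
    if not existing_positions:
        # Nothing embedded yet: any candidate set is acceptable.
        return candidate_sets[0]
    old = centroid(existing_positions)
    for cand in candidate_sets:
        new = centroid(cand)
        if hypot(new[0] - old[0], new[1] - old[1]) >= min_distance:
            return cand
    raise ValueError("no candidate set has a sufficiently different centroid")
```

For example, with an existing watermark at the four corners of a 3x3 block (center of gravity (1, 1)), a candidate set centered on (1, 1) would be skipped in favor of one centered elsewhere.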
The audio parameters of the audio ID information may be calculated from the state of the digital original image information, or may be selected from a table. An information processing apparatus adopting the latter approach includes input means, storage means, and arithmetic control means. The storage means stores embedding control information, which predefines the embedding positions and the order of those positions for embedding invisible digital watermark information in the sample value domain of digital original image information or in the frequency domain obtained by frequency-transforming the digital original image information, and audio parameter information, which predefines the audio configuration parameters of digital audio ID information having meaning as digital watermark information. The input means receives digital original image information. The arithmetic control means selects the required audio parameters and embedding control information from the storage means, and embeds digital audio ID information conforming to the selected audio parameters, at the positions and in the order specified by the selected embedding control information, in the sample value domain of the digital original image information or in the frequency domain obtained by frequency-transforming it.
Encryption may also be employed in embedding the digital watermark information. That is, the information processing apparatus includes input means, storage means, and arithmetic control means. The storage means stores embedding control information, in which the embedding positions and the order of those positions for embedding invisible digital watermark information in the sample value domain of digital original image information or in the frequency domain obtained by frequency-transforming the digital original image information are predefined in encrypted form, and audio parameter information, in which the audio configuration parameters of digital audio ID information having meaning as digital watermark information are predefined in encrypted form. The input means receives digital original image information. The arithmetic control means selects the required encrypted audio parameters and encrypted embedding control information from the storage means, decrypts them, and embeds digital audio ID information conforming to the decrypted audio parameter information, at the positions and in the order specified by the decrypted embedding control information, in the sample value domain of the digital original image information or in the frequency domain obtained by frequency-transforming it.
An information processing apparatus viewed from the standpoint of reproducing the ID information includes input means and arithmetic control means, wherein the input means receives information in which digital audio ID information having meaning as ID information is separably embedded in digital original image information as invisible digital watermark information, and the arithmetic control means is capable of separating the digital audio ID information from the received information.
An information processing apparatus according to a more detailed aspect, viewed from the standpoint of reproducing the ID information, includes input means and arithmetic control means. The input means receives information in which digital audio ID information having meaning as ID information is embedded as invisible digital watermark information in predetermined frequency components of digital original image information. The arithmetic control means is capable of applying a frequency transform to the received information and separating the digital audio ID information from the information obtained by the frequency transform.
The arithmetic control means may further apply noise-removal filtering to the separated digital audio ID information. The filtering makes the audio ID easier to recognize. The arithmetic control means may further perform speech recognition on the digital audio ID information that has undergone the noise-removal filtering. This enables automatic recognition of the audio ID.

BRIEF DESCRIPTION OF THE FIGURES
Fig. 1 is an explanatory diagram showing an example of a method of embedding digital audio ID information as a digital watermark in the sample value domain of digital original image information.
Fig. 2 is an explanatory diagram showing an example of a method of embedding digital audio ID information as a digital watermark in the frequency domain of digital original image information.
Fig. 3 is an explanatory diagram illustrating rules that define the order in which digital audio ID information is embedded.
Fig. 4 is a flowchart showing in detail the processing procedure for embedding the audio ID information by the method of Fig. 1.
Fig. 5 is an explanatory diagram illustrating a method of separating the digital audio ID information and the original image information from image information in which the digital audio ID information was embedded by the method of Fig. 1.
Fig. 6 is an explanatory diagram illustrating a method of separating the digital audio ID information and the original image information from image information in which the digital audio ID information was embedded by the method of Fig. 2.
Fig. 7 is a flowchart showing in detail the processing procedure for separating the audio ID information and the original image information by the method of Fig. 5.
Fig. 8 is a block diagram showing a basic configuration example of an information processing system, assuming that a digital watermark based on audio ID information is applied to electronic approval and electronic commerce.
Fig. 9 is an explanatory diagram showing in detail the flow of processing by the information processing apparatus of Fig. 8.
Fig. 10 is an explanatory diagram that highlights the embedding locations when digital watermarks based on audio ID information are embedded multiple times in image information.
Fig. 11 is an explanatory diagram showing an example in which a digital watermark based on audio ID information is embedded in electronic approval documents and electronic identification documents.
Fig. 12 is a block diagram illustrating an outline of an electronic commerce or electronic approval system using audio ID information.
Fig. 13 is a block diagram showing an example of an information processing apparatus that uses audio ID information for digital watermarking, and of an information processing network that includes it.

BEST MODE FOR CARRYING OUT THE INVENTION
Fig. 1 shows an example of a method of embedding digital audio ID information as a digital watermark in the sample value domain of digital original image information. Fig. 2 shows an example of a method of embedding digital audio ID information as a digital watermark in the frequency domain of digital original image information.
Digital audio ID information 3 can be embedded in original image information as a digital watermark in two ways: in the sample value domain, that is, in the raw original image information 1 itself, as shown in Fig. 1; or in the frequency domain obtained by applying frequency transform (orthogonal transform) processing 6, such as an FFT (Fast Fourier Transform), to the raw original image information 1, as shown in Fig. 2.
In Figs. 1 and 2, several versions of the digital audio ID information 3 with different audio parameters are prepared in advance; one is selected by selection means 4 and supplied to watermark embedding processing 2 or 7. The digital audio ID information 3 is digital information produced by A/D conversion of analog speech that indicates, for example, copyright ownership, or digital information produced by digital speech synthesis.
In the method of Fig. 2, the information that has undergone the embedding processing is subjected to inverse transform processing 8, the inverse of the frequency transform processing. In Figs. 1 and 2, reference numerals 5 and 9 denote the information formed by embedding the digital audio ID information 3 in the original image information as a digital watermark.
In embedding processing 2 and 7, the optimal embedding positions for the audio ID information can be determined from the digital original image information in accordance with a conventional, general-purpose algorithm for determining digital watermark embedding positions. In doing so, the original image information is altered as little as possible. For example, watermark embedding processing 2 exploits the masking effect, whereby a low-luminance pixel near a high-luminance pixel is hard to see, and embeds the digital audio ID information near high-luminance pixels. Watermark embedding processing 7, for example, embeds the digital audio ID information in portions rich in high-frequency components, where embedded watermark information is hard to notice.
The digital audio configuration parameters of the audio ID information 3 include the number of quantization bits of the audio a(i), the sampling period (frequency) Δt, and the number of audio data points (duration) n. These parameters must be determined according to the watermark embedding strength required for the audio ID information. In general, the larger the number of quantization bits, the shorter the sampling period, and the larger the number of audio data points, the stronger the watermark of the audio ID information can be made. Here, a stronger watermark means a greater ability to preserve the original form of the information against disturbances such as noise and against loss of information.
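To make the three parameters concrete, the following sketch quantizes audio samples to a chosen bit depth and counts the data points n for a given sampling frequency and duration. It assumes samples normalized to [-1.0, 1.0] and uniform quantization; the function names and the example values are hypothetical and are not taken from the specification.

```python
def quantize(samples, bits):
    """Map samples in [-1.0, 1.0] to unsigned integer codes of the given bit depth."""
    levels = (1 << bits) - 1          # highest code, e.g. 15 for 4 bits
    codes = []
    for s in samples:
        s = max(-1.0, min(1.0, s))    # clip out-of-range input
        codes.append(round((s + 1.0) / 2.0 * levels))
    return codes

def sample_count(sampling_hz, seconds):
    """Number of audio data points n for a sampling frequency and a duration."""
    return int(round(sampling_hz * seconds))

def payload_bits(bits, sampling_hz, seconds):
    """Total number of bits that would have to be embedded."""
    return bits * sample_count(sampling_hz, seconds)
```

Raising the bit depth, the sampling frequency, or the duration each increases `payload_bits`, which is the cost side of the trade-off described above.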
A characteristic of digital audio ID information is that, when the digital audio ID information embedded as a digital watermark is extracted, it is sufficient that the meaning of the speech can be recognized (understood) by a person (a checker) or by a machine. Audio quality in terms of sound quality, noise level, gender identification, speaker identification, and so on may be poor, as long as the meaning can be understood. For example, with digital audio configuration parameters of 4 quantization bits, a sampling frequency of 4 kHz, and a data duration of 1 to 2 seconds, a watermark data amount of 500 to 1000 sample points is sufficient for the digital audio ID information. In the examples of Figs. 1 and 2, several versions of the digital audio ID information 3 with different audio parameters are prepared in advance according to the required embedding strength and the content of the embedded information (for example, whether the content is easy to understand), and one of them is selected as appropriate by selection means 4. That is, watermark embedding processing 2 or 7 determines the audio parameters for embedding, and selection means 4 selects the digital audio ID information accordingly. Not every audio parameter of the selectable digital audio ID information needs to be selectable. It is also acceptable simply to prepare several selectable versions of the digital audio ID information in which the number of audio data points n and the sampling period Δt are fixed.
In embedding processing 2 and 7, the order in which the digital audio ID information is embedded, that is, the order in which the information is reproduced, is determined together with the embedding positions. As illustrated in Fig. 3, the embedding order can be chosen from horizontal zigzag (A), vertical zigzag (B), horizontal one-way (C), random (D), and so on. For example, when order (C) of Fig. 3 is adopted, the digital original image information is scanned sequentially in the direction of the arrows in Fig. 3(C), and at each embedding position data of the digital audio ID information (more precisely, one data block of the data sequence that makes up the digital audio ID information) is embedded.
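The four scan orders can be sketched as generators of (row, column) sequences. This is an illustration only: "zigzag" is assumed here to mean a back-and-forth (boustrophedon) scan, and the use of a seeded pseudo-random shuffle for order (D) is likewise an assumption; the function names are hypothetical.

```python
import random

def horizontal_zigzag(rows, cols):
    """(A) Scan each row, reversing direction on every other row."""
    order = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in cs)
    return order

def vertical_zigzag(rows, cols):
    """(B) Scan each column, reversing direction on every other column."""
    order = []
    for c in range(cols):
        rs = range(rows) if c % 2 == 0 else range(rows - 1, -1, -1)
        order.extend((r, c) for r in rs)
    return order

def horizontal_one_way(rows, cols):
    """(C) Plain raster scan: every row left to right."""
    return [(r, c) for r in range(rows) for c in range(cols)]

def random_order(rows, cols, seed):
    """(D) Pseudo-random order; the same seed reproduces the same order."""
    order = horizontal_one_way(rows, cols)
    random.Random(seed).shuffle(order)
    return order
```

Because the embedding order is also the reproduction order, a random order is only usable if the extractor can regenerate it, which is why the sketch takes an explicit seed.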
After watermark embedding processing 2 or 7, in the case of Fig. 1 the information obtained from embedding processing 2 is itself the new information 5 in which the audio ID information is embedded, and in the case of Fig. 2 the information obtained after inverse transform processing 8 is the new information 9 in which the audio ID information is embedded. In the examples of Figs. 1 and 2, the new information 5, 9 also contains the digital watermark embedding positions as key data. Although not specifically illustrated, the audio ID configuration parameters, such as the number of quantization bits, and the embedding order data may also be included as key data. If the audio ID configuration parameters were calculated from the state of the original image information, and likewise if the embedding positions were calculated from the state of the original image information, there is no need to include that information in the new information 5, 9 for reproduction, because it can be obtained again by calculation at reproduction time. If such calculation is not possible, the embedding positions and the audio ID configuration parameters may be included in the information 5, 9 as key data; however, if an external certification authority or the like holds the key data and it can be obtained when needed, no key data need be included in the information 5, 9 at all. Likewise, if the audio configuration parameters, embedding positions, and embedding order are determined in a fixed manner according to a set algorithm, that information need not be included in the key data.
Comparing the information processing methods of Figs. 1 and 2: the method of Fig. 1 has no frequency transform processing 6 or inverse transform processing 8, so it needs fewer computation steps and is lightweight, which suits real-time processing such as video. The method of Fig. 2, because of the frequency transform and inverse transform, makes removal of the embedded watermark information more complicated than in the method of Fig. 1 and is therefore superior in preventing tampering with the watermark information. The frequency transform is not limited to the FFT and may be a DCT (Discrete Cosine Transform) or the like.
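A minimal sketch of the transform, embed, inverse-transform flow of Fig. 2 on a single 8-sample block is given below. The text does not specify how a bit is written into a frequency component; this sketch substitutes a simple quantization rule (forcing one DCT coefficient to an even or odd multiple of a step size), which is an assumption, as are the function names and the `index` and `step` defaults. Rounding and clipping of the resulting pixel values are omitted.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out

def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        out.append(sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * coeffs[k]
                       * math.cos(math.pi * (i + 0.5) * k / n)
                       for k in range(n)))
    return out

def embed_bit(block, bit, index=5, step=8.0):
    """Transform, force coefficient `index` to an even/odd multiple of `step`, invert."""
    coeffs = dct(block)
    q = round(coeffs[index] / step)
    if q % 2 != bit:
        q += 1                      # move to the neighbouring multiple with the right parity
    coeffs[index] = q * step
    return idct(coeffs)

def extract_bit(block, index=5, step=8.0):
    """Transform again and read the parity of the quantized coefficient."""
    return round(dct(block)[index] / step) % 2
```

Extraction has to repeat the forward transform before the bit can be read, which illustrates why removing the watermark is more involved than in the sample value domain.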
Fig. 4 shows in detail the processing procedure for embedding the audio ID information by the method of Fig. 1. The processing shown in the figure can be executed by an information processing apparatus such as a computer system having, for example, a computer main unit, a display, input devices such as a keyboard, and an interface circuit capable of exchanging information with the outside.
When audio ID information embedding is requested, the digital audio ID information for the watermark to be embedded in the original image information is set (11). Setting the audio ID information here means creating the digital audio ID information from scratch. For example, speech indicating copyright ownership is produced as analog audio information and converted into digital audio ID information using several sets of audio parameters. Next, the audio ID information parameters are determined with reference to the original image information, and the corresponding digital audio ID information for embedding is determined (12). For example, the required audio configuration parameters are determined from the state of the original image, such as its data density and color tone, taking into account how much embeddable area is available, how much watermark strength is needed, and so on. Then, from the original image information, the positions and order in which the digital audio ID information is to be embedded as watermark information are determined according to a predetermined algorithm, such as one exploiting the masking effect described above (13). After that, the digital audio ID information for the watermark is embedded in the digital original image information, moving from position to position in that order and repeating until all of it has been embedded (14, 15).
To implement the method of Fig. 2, although not specifically illustrated, the original image information is frequency-transformed before step 14. In addition, after all embedding has been completed (after "No" at step 15), the inverse of the frequency transform is applied to the information in which the audio ID information has been embedded. These steps are what differ from the procedure of Fig. 4.
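Steps 13 to 15 of the procedure can be sketched for the sample value domain as follows. The specifics are assumptions made for illustration: the image is a list of rows of 8-bit luminance values, embedding positions are dark pixels directly beside a bright pixel (a crude stand-in for the masking effect), the order is a raster scan, and each payload bit replaces the least significant bit of a pixel. None of these choices, nor the function names or the `threshold` value, is prescribed by the text.

```python
def choose_positions(image, threshold=200):
    """Step 13: pick dark pixels horizontally adjacent to a bright pixel, in raster order."""
    rows, cols = len(image), len(image[0])
    positions = []
    for r in range(rows):
        for c in range(cols):
            left = c > 0 and image[r][c - 1] >= threshold
            right = c + 1 < cols and image[r][c + 1] >= threshold
            if image[r][c] < threshold and (left or right):
                positions.append((r, c))
    return positions

def embed(image, bits, positions):
    """Steps 14-15: write one bit per position, repeating until the payload is exhausted."""
    if len(bits) > len(positions):
        raise ValueError("not enough embedding positions for the payload")
    marked = [row[:] for row in image]       # leave the original untouched
    for bit, (r, c) in zip(bits, positions):
        marked[r][c] = (marked[r][c] & ~1) | bit
    return marked
```

With an even threshold, changing a least significant bit never moves a pixel across the threshold, so `choose_positions` returns the same list for the marked image as for the original. That property is what would let an extractor recompute the positions instead of reading them from key data.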
Fig. 5 shows a method of separating the digital audio ID information and the original image information from image information in which the digital audio ID information was embedded by the method of Fig. 1.
The watermark extraction process 21, which separates the digital audio ID information and the original image information from the image information in which the digital audio ID information is embedded, determines the positions from which the digital audio ID information is to be extracted from the image information 5, according to the embedding positions and embedding order used at embedding time, and separates the original image information from the digital audio ID information.
The embedding positions and embedding order of the digital audio ID information are, for example, encrypted at watermark embedding time and stored as key data, for example in the information 5. The extraction positions and extraction order can therefore be obtained by retrieving that key data according to a fixed rule and using it. The digital audio ID information separated in this way has thereby been reconstructed into a reproducible state. The reconstructed digital audio ID information can then be played back or subjected to speech recognition in accordance with the audio configuration parameters.

If the embedding positions and order were determined from the original image information at embedding time, then even when the image information 5 does not directly contain information indicating the embedding positions and order, they can be recalculated from the original image information and the audio ID information reconstructed according to the calculated positions and order. In that case, audio parameters such as the number of quantization bits of the audio a(i), the sampling period (frequency) Δt, and the number of audio data samples (duration) n can be obtained from the restored information, that is, from what remains after the embedded information has been removed from the embedding positions, using the same algorithm as was used to determine them at embedding time. Alternatively, if a fixed rule has been set in advance, it suffices to follow that rule.
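The case where no position data is stored, and the order is recomputed from the image itself, can be sketched as follows. This is an illustration under stated assumptions, not the method of the description: the payload sits in pixel least significant bits, and the hypothetical position_order ranks pixels by a crude local-activity measure computed only from the upper seven bits, which the embedding never alters. The audio parameters (sample count and quantization bits) are passed in directly here, standing in for the predetermined rule.

```python
def position_order(pixels):
    """Rank pixel indices using only the upper 7 bits of each pixel.

    LSB embedding never changes those bits, so the embedder and the extractor
    compute the same order from the original and the watermarked image.
    """
    def activity(i):
        centre = pixels[i] >> 1
        left = pixels[i - 1] >> 1 if i > 0 else centre
        right = pixels[i + 1] >> 1 if i + 1 < len(pixels) else centre
        return abs(centre - left) + abs(centre - right)
    # Busiest neighbourhoods first (a crude stand-in for a masking model).
    return sorted(range(len(pixels)), key=lambda i: (-activity(i), i))

def embed_bits(pixels, payload):
    """Write payload bits into pixel LSBs in the content-derived order."""
    marked = list(pixels)
    for pos, bit in zip(position_order(pixels), payload):
        marked[pos] = (marked[pos] & ~1) | bit
    return marked

def extract(marked, n_samples, bits):
    """Read the LSBs back in embedding order and regroup them into samples."""
    order = position_order(marked)[: n_samples * bits]
    stream = [marked[pos] & 1 for pos in order]
    samples = []
    for k in range(0, len(stream), bits):
        value = 0
        for b in stream[k:k + bits]:
            value = (value << 1) | b
        samples.append(value)
    restored = list(marked)
    for pos in order:
        restored[pos] &= ~1          # embedded bits removed from the image
    return samples, restored

pixels = [(37 * i) % 256 for i in range(64)]
ids = [9, 3, 15, 0]
payload = [(v >> k) & 1 for v in ids for k in (3, 2, 1, 0)]
marked = embed_bits(pixels, payload)
samples, restored = extract(marked, n_samples=4, bits=4)
print(samples)
```

Note that plain LSB replacement cannot bring back the original low bits, so the restored image in this sketch is only an approximation of the original.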
In FIG. 5, 22 denotes the separated digital audio ID information, and 23 denotes the restored information of the separated original image information.

FIG. 6 shows a method of separating the digital audio ID information and the original image information from image information in which the digital audio ID information was embedded by the method of FIG. 2. In this case, a frequency transform 26 equivalent to the frequency transform 6 used in the embedding process, such as an FFT (Fast Fourier Transform), is applied to the watermarked information 9, after which the watermark extraction process 24 is performed using the same algorithm as described above. The restored signal 23 here is obtained by removing the embedded information that was present at the embedding positions and then applying the inverse transform 28 of the frequency transform 26 to the result.
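The transform-domain variant (transform, embed, inverse transform; then transform, extract, inverse transform) can be sketched as below. To keep the example exact in integer arithmetic, it swaps the FFT for a one-level integer Haar transform on pixel pairs; that substitution, the choice of the detail coefficients as carriers, and all function names are assumptions for illustration only.

```python
def forward(pixels):
    """Integer Haar (S-transform) on pixel pairs: returns (averages, details)."""
    pairs = list(zip(pixels[0::2], pixels[1::2]))
    return [(a + b) >> 1 for a, b in pairs], [a - b for a, b in pairs]

def inverse(avg, det):
    """Exact inverse of forward()."""
    out = []
    for s, d in zip(avg, det):
        a = s + ((d + 1) >> 1)
        out.extend([a, a - d])
    return out

def embed(pixels, payload):
    avg, det = forward(pixels)                 # transform before embedding
    for i, bit in enumerate(payload):
        det[i] = (det[i] & ~1) | bit           # one bit per detail coefficient
    return inverse(avg, det)                   # back to the pixel domain

def extract(marked, n_bits):
    avg, det = forward(marked)                 # corresponds to transform 26
    bits = [d & 1 for d in det[:n_bits]]       # corresponds to extraction 24
    cleared = [d & ~1 for d in det[:n_bits]] + det[n_bits:]
    return bits, inverse(avg, cleared)         # corresponds to inverse 28

pixels = [(37 * i) % 200 + 20 for i in range(32)]   # mid-range values, no clipping
payload = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed(pixels, payload)
bits, restored = extract(marked, len(payload))
print(bits)
```

The sketch assumes an even number of pixels and values far enough from 0 and 255 that the inverse transform never leaves the 8-bit range; a practical implementation would have to handle that case.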
FIG. 7 shows in detail the processing procedure for separating the audio ID information and the original image information by the technique of FIG. 5 and then playing the audio back. The processing shown in the figure can be executed by an information processing apparatus such as a computer system having, for example, a computer main unit, a display, an input device such as a keyboard, and an interface circuit capable of exchanging information with the outside.
When separation of the audio ID information is instructed, data indicating the positions at which the audio ID data for the watermark was embedded is obtained from the key data according to a predetermined watermark extraction algorithm (31). Based on those positions, the restored data and the audio ID data are provisionally separated (32). The order of the digital audio ID information is not taken into account in this separation. In the process of FIG. 7, the audio parameters are determined from the separated restored data (33). The order of the data in the digital audio ID information is assumed to have been fixed in advance at the embedding stage. Using that same order, and referring to the number of quantization bits in the audio configuration parameters, the digital audio ID information is reconstructed from the data provisionally separated in step 32 (34). Once the digital audio ID information has been reconstructed and completely separated, noise removal is applied to it (35). In FIG. 7, audio playback processing is then performed on the separated, noise-removed digital audio ID information, and the meaning of the digital audio ID information is extracted or recognized (36). If multiple pieces of audio ID information are embedded, the processing of steps 33 to 36 is carried out for all of them by way of step 37.
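Steps 34 and 35 (reconstruction using the number of quantization bits, followed by noise removal) might look like the following sketch. The description does not name a particular filter; the 3-point median used here is an assumed example chosen because it suppresses isolated bit errors, and the function names are hypothetical.

```python
from statistics import median

def regroup(stream, bits):
    """Step 34: rebuild quantized samples from the separated bit stream."""
    usable = len(stream) - len(stream) % bits
    return [int("".join(map(str, stream[k:k + bits])), 2)
            for k in range(0, usable, bits)]

def denoise(samples):
    """Step 35 (assumed filter): 3-point median, which removes isolated spikes."""
    if len(samples) < 3:
        return list(samples)
    inner = [int(median(samples[i - 1:i + 2])) for i in range(1, len(samples) - 1)]
    return [samples[0]] + inner + [samples[-1]]

noisy = [4, 5, 15, 6, 7]                      # 15 is an isolated error
stream = [(v >> k) & 1 for v in noisy for k in (3, 2, 1, 0)]
samples = regroup(stream, bits=4)
print(samples, denoise(samples))
```

When several audio IDs are embedded, the same two functions would simply be applied to each separated bit stream in turn, matching the loop through step 37.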
When implementing the method of FIG. 6, although not specifically illustrated, the information 9 is frequency-transformed before step 32. In addition, the information corresponding to the original image information separated in step 32 is subjected to the inverse of the frequency transform to obtain the restored information. These steps are what differ from the case of FIG. 7.
FIG. 8 shows a basic configuration example of an information processing system, assuming that the above method is applied to businesses such as electronic approval, electronic commerce, and information distribution.
In FIG. 8, 40 and 41 are each an information processing apparatus. The first information processing apparatus 40 can output image information in which digital audio ID information that is meaningful as ID information is separably embedded, as invisible digital watermark information, in the frequency domain of the digital original image information. The second information processing apparatus 41 receives the image information output by the first information processing apparatus 40 and uses it as electronic approval information, signature information, or the like. The second information processing apparatus 41 can check whether the image information is genuine by separating the digital audio ID information from the image information and playing it back.
The process by which the first information processing apparatus 40 embeds the digital audio ID information in the image information is the same as the process of FIG. 2. In typical electronic business, the information 9 in which the digital audio ID information 3 is embedded as a digital watermark undergoes encoding (compression) processing (42) standardized as MPEG (Moving Picture Experts Group), JPEG (Joint Photographic Experts Group), or the like, and the encoded data is then encrypted (44) before being transmitted over the network. The transmission path over which the encrypted data travels acts as a disturbance that can remove or destroy the embedded watermark information.
The audio ID information embedded as watermark information must be strong enough to withstand such disturbances. As described above, the strength of the audio ID information varies with audio parameters such as the number of quantization bits and the sampling frequency, so the required strength can be set easily. Moreover, even if audio quality degrades because part of the audio ID information is lost, it is enough that sufficient information remains for the meaning of the speech to be recognized in the end. With audio data, the meaning can usually still be understood even if, for example, 50% of the information is lost, so by its nature audio information is well suited to withstanding disturbance. An audio duration of 1 to 3 seconds allows roughly 20 to 50 words or phrases to be reproduced, which is sufficient for proving an ID. As the strength is increased, the amount of embedded information grows and the undesired degradation of the original image information advances accordingly, so the two are in a trade-off. Even so, the watermark strength can be increased simply by changing the audio parameters, the audio quality can easily be raised, and high-quality audio identification is possible when needed.
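The trade-off between audio parameters and embedded data volume is easy to quantify. The sketch below computes the uncompressed payload size and the share of an image's capacity it would consume. The 4-bit, 4 kHz setting is the minimal example quoted later in the description; the 640 x 480 image, the capacity of one bit per pixel, and the 8-bit, 8 kHz comparison case are assumptions for illustration.

```python
def payload_bits(quant_bits, sample_rate_hz, seconds):
    """Uncompressed size of the audio ID in bits."""
    return int(quant_bits * sample_rate_hz * seconds)

def occupancy(quant_bits, sample_rate_hz, seconds, width, height, bits_per_pixel=1):
    """Fraction of the image's embedding capacity consumed by the audio ID."""
    capacity = width * height * bits_per_pixel
    return payload_bits(quant_bits, sample_rate_hz, seconds) / capacity

for q, fs, t in [(4, 4000, 1), (4, 4000, 3), (8, 8000, 3)]:
    print(q, fs, t, payload_bits(q, fs, t), round(occupancy(q, fs, t, 640, 480), 3))
```

Under these assumptions, one second at 4 bits and 4 kHz occupies about 5% of the image's capacity, while three seconds at 8 bits and 8 kHz would occupy more than 60%, which shows how quickly higher audio quality eats into the image.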
The encrypted data delivered over the transmission path is decrypted (47), and the second information processing apparatus 41 then performs decoding (decompression) of the MPEG code (48). The decoded information corresponds to the image information 9 in which the digital watermark is embedded. When that image information is used for electronic approval or an electronic signature, whether it is genuine is judged at the level of the visible information and, further, at the level of the invisible digital watermark. For example, when judgment at the level of the invisible digital watermark is required, the audio ID information is separated through the frequency transform 26 and the watermark extraction process 24 as described with reference to FIG. 6, noise is removed from the separated information by filtering (50), speech recognition is performed (51), and the meaning of the ID speech is recognized. If the audio ID information itself was encoded before embedding, the information 22 is decoded to obtain the audio ID information.
For reference, FIG. 9 shows in detail the flow of processing by the information processing apparatuses of FIG. 8.
FIG. 10 shows, with emphasis, the embedding locations when multiple pieces of audio ID information are embedded in image information as digital watermarks.
When the rights in the original image information 60 are held by several parties, each right holder can be expected to embed its own ID information. Region 61 is the audio ID information region of the copyright holder, 62 is that of the owner of the original image information, and 63 is that of the holder of the licensed reproduction right for the original image information. Each party's own digital audio ID information is embedded in its region, one after another, as digital watermark information. 61D, 62D, and 63D denote the fragment data of the respective digital audio ID information.
In the example of FIG. 10, the previous embedding centroid is known, so the next piece of digital watermark information is embedded around a new embedding centroid shifted away from it. That is, in each of the regions 61, 62, and 63, the centroid of the region can be determined from the position data of the embedded audio ID information. When new digital audio ID information is embedded, the new embedding positions are chosen so that they are shifted away from the existing centroids. A location for storing the history of the centroid positions of the regions, or their latest positions, may be reserved in the information 5 or 9.
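The centroid bookkeeping for multiple embedding can be sketched in a few lines. The description only requires that a new centroid be shifted away from the existing ones; the rule used here, picking the coarse grid point farthest from every recorded centroid, is one assumed policy among many, and the function names, grid step, and image size are hypothetical.

```python
import math

def centroid(positions):
    """Centre of gravity of the (x, y) positions used by one audio ID."""
    xs, ys = zip(*positions)
    return sum(xs) / len(xs), sum(ys) / len(ys)

def next_centroid(existing, width, height, step=8):
    """Grid point whose nearest recorded centroid is as far away as possible."""
    candidates = [(x, y) for y in range(0, height, step) for x in range(0, width, step)]
    return max(candidates, key=lambda c: min(math.dist(c, e) for e in existing))

region_61 = [(10, 10), (12, 10), (10, 14), (12, 14)]
history = [centroid(region_61)]               # stored alongside the image
history.append(next_centroid(history, 64, 64))
history.append(next_centroid(history, 64, 64))
print(history)
```

Keeping the history list with the image, as the description suggests for the information 5 or 9, lets each later right holder choose a region without knowing the earlier payloads.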
FIG. 11 shows examples in which a digital watermark based on audio ID information is embedded in an electronic approval document and an electronic identification document.
In the electronic approval document 70, the seal impression image information of the digital approval seals 71 and 72, which is easily copied illegally, serves as the original image information, and the audio ID information is embedded in it as digital watermark data.
Likewise, in the electronic identification card 73, the digital face photograph 84, which is easily copied illegally, serves as the original image information, and the audio ID information is embedded in it as digital watermark data.
FIG. 12 illustrates an overview of an electronic commerce or electronic approval system that uses audio ID information.
First, in the work production department 81, the copyright holder 82 creates audio ID information 83 that only he or she can know. In the product sales department (or electronic approval creation department) 84, a program 86 that embeds the digital audio ID information 83 by the embedding method of FIG. 1 or FIG. 2 is used to embed this audio ID information 83 in the copyright information A 85 produced by the copyright holder 82, yielding new copyright information B 87 in which the audio ID information is embedded. This copyright information B 87 is used as an approved item or as an information distribution item. When it becomes necessary to check whether the copyright information B 87 is genuine or an illicit use, the checking organization 90 makes the determination. In this case, the work B 88 under examination is passed through a detection program 91, which extracts the digital audio ID information by the audio ID information detection and separation method of FIG. 5 or FIG. 6, to detect the audio ID information 92. This is then compared and collated (94) with the audio ID information registered in advance in the database 96 of the management department 93, and a pass or fail decision is made.
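The comparison and collation step 94 is not specified in detail. As one hedged possibility, the sketch below compares the extracted quantized samples with each registered ID and accepts the best match if enough samples agree. The per-sample tolerance, the 0.8 threshold, and the owner labels are all assumptions; a real checker might instead rely on listening or on speech recognition, as the description suggests elsewhere.

```python
def match_ratio(extracted, registered, tolerance=1):
    """Share of sample positions where the two IDs agree within `tolerance`."""
    longest = max(len(extracted), len(registered))
    if longest == 0:
        return 0.0
    hits = sum(abs(a - b) <= tolerance for a, b in zip(extracted, registered))
    return hits / longest

def collate(extracted, database, threshold=0.8):
    """Return (owner, ratio) for the best registered match, or (None, ratio)."""
    best_owner, best_ratio = None, 0.0
    for owner, registered in database.items():
        ratio = match_ratio(extracted, registered)
        if ratio > best_ratio:
            best_owner, best_ratio = owner, ratio
    return (best_owner if best_ratio >= threshold else None), best_ratio

database = {
    "holder-82": [4, 5, 6, 7, 8, 9, 10, 11],
    "someone-else": [15, 0, 15, 0, 15, 0, 15, 0],
}
extracted = [4, 5, 7, 7, 8, 9, 10, 0]         # one damaged sample at the end
print(collate(extracted, database))
```

A tolerant comparison of this kind reflects the point that an audio ID remains usable even when part of it has been damaged in transit.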
Here the copyright holder 82 creates the audio ID information 83, but it need not be created by the copyright holder 82; a copyright manager 95 or the like may create it. Also, although the management department 93, the production department 81, and the checking department 90 are shown as separate, they likewise need not be separate: all three may form a single department, or any two of them may be combined.
In general, a digital watermark is intended for the case where encrypted information, an illegal copy of which would be meaningless as it stands, is nevertheless illegally decrypted and illegally copied as restored information. In that case the digital watermark information is extracted by comparison with the original information that only the copyright holder possesses, in order to assert copyright or to identify the illegal copier. Accordingly, the audio ID watermark extraction processing shown in FIGS. 8 and 9 is in most cases used only when investigating suspected misuse. The extraction processing is performed by, for example, a watermark management company.
In FIG. 12, the product sales department (or electronic approval creation department) 84 can be regarded as a first information processing apparatus and the checking organization 90 as a second information processing apparatus. The first information processing apparatus 84 executes the audio ID embedding program 86 to generate and output copyright information B 87, in which the digital audio ID information is embedded in the copyright information A 85. The second information processing apparatus 90 receives the copyright information B 87, extracts the digital audio ID information 92 from it as needed, and makes it available for comparison with the ID information of the genuine copyright holder registered in advance in the database.
FIG. 13 shows an example of an information processing apparatus (also referred to as a computer apparatus) that uses the audio ID information for digital watermarking, and of an information processing network that includes it.
The information processing network shown in FIG. 13 is a system such as a LAN (local area network), a WAN (wide area network) such as the Internet, or a wireless communication network, and 104 denotes the transmission medium in that system, such as optical fiber, an ISDN line, or a wireless link. Although not particularly limited, a host computer apparatus 103 is connected to the transmission medium 104, as are the representatively shown terminal computer apparatuses 100, 101, and 102, via communication adapters 105, 106, and 107 such as routers and terminal adapters.

The terminal computer apparatus 100, although not particularly limited, has a data processor (MPU) 109 implemented as a semiconductor integrated circuit. A display controller (DISPC) 113, a network controller (NETC) 114, and a DRAM 115 are connected to the external bus 108. Also provided are a floppy disk controller (FDC) 110, a keyboard controller (KEYC) 111, and an Integrated Device Electronics controller (IDEC) 112, which are connected to peripheral circuits (not shown) built into the data processor 109. The DISPC 113 controls drawing into a video RAM (VRAM) 121 and controls display of the drawn display data on a display (DISP) 120. The NETC 114 is connected to the communication adapter 105 and performs buffering of transmitted and received information, communication protocol control, and the like. The DRAM 115 is used as the program area, work area, and so on of the data processor 109. A floppy disk drive 116 is connected to the FDC 110; it reads information from, and writes information to, a floppy disk 130, which is one example of a recording medium. A keyboard 117 is connected to the KEYC 111. A hard disk drive (HDD) 118 and a CD-ROM drive (CDRD) 119 are connected to the IDEC 112. The HDD 118 has a magnetic disk, which is another example of a recording medium. The CDRD 119 takes a CD-ROM 131, which is yet another example of a recording medium. The other terminal computer apparatuses 101 and 102 are configured in the same way.

When, for example, the terminal computer apparatus 100 is used to perform the digital audio ID information embedding process of FIG. 1 or FIG. 2, or the digital audio ID information extraction process of FIG. 5 or FIG. 6, the program for doing so is installed by the user, for example, from the floppy disk 130 or the CD-ROM 131 onto the hard disk drive 118. In this case the program has been recorded on the floppy disk 130 or the CD-ROM 131 in advance. The manufacturer of the terminal computer apparatus 100 may also supply the program preinstalled on the hard disk drive.
When the data processor 109 executes the installed program, it loads the program into the DRAM 115 and sequentially fetches and executes instructions from the DRAM 115. Part of the program stored on the CD-ROM 131 can also be read directly from the CD-ROM and executed.

The terminal computer apparatus 100 can thus install the program via the floppy disk 130 or the like, or execute the program directly from the hard disk drive 118 or the like. Accordingly, the recording media 130 and 131 storing the program make it easy for the terminal computer apparatus 100, an example of an information processing apparatus, to embed ID information as a digital watermark and to separate the embedded ID information, where the watermark requires little embedded information relative to the original image information and is robust both against disturbance to the embedded original information and against loss of information.
The terminal computer apparatus 100 can also download the program from the host computer apparatus 103. That is, the host computer apparatus 103 holds the program, for example in compressed form, on a hard disk device or the like. After the terminal computer apparatus 100 establishes communication with the host computer apparatus 103, the terminal computer apparatus 100 specifies the program and instructs a download, whereupon the program is transmitted over the transmission medium 104 and downloaded to the hard disk drive 118 of the terminal computer apparatus 100. The downloaded program is then decompressed and installed in a predetermined program storage area.

In this way, the terminal computer apparatus 100 can easily obtain the program for embedding or separating the digital audio ID information over the network via the transmission medium 104. The transmission medium 104 therefore helps the terminal computer apparatus 100 to realize easily the embedding of ID information as a digital watermark and the separation of the embedded ID information, where, as described above, the watermark requires little embedded information relative to the original image information and is robust both against disturbance to the embedded original information and against loss of information.

Recording media such as the floppy disk 130 and the CD-ROM 131 are also media on which the image information, in which digital audio ID information is embedded in original image information as digital watermark information, is recorded and distributed. Likewise, the transmission medium 104 is also a medium that transmits that image information to an information processing apparatus such as the information processing apparatus 100.
An information processing apparatus 100 or the like that receives image information recorded on the recording media 130 and 131, or transmitted via the transmission medium 104, can, as needed, separate the digital audio ID information serving as digital watermark information and confirm its content. This processing can be realized relatively easily even on a low-cost information processing apparatus.
The effects of the various embodiments described above, in which digital audio ID information is embedded as a digital watermark and is separated so that it can be played back, can be summarized as follows.
[1] Because digital audio ID information is adopted as digital watermark information that is meaningful as ID information, its meaning can be recognized even if, for example, 50% of the information is lost, so a watermark that is robust against disturbance is easily realized. In addition, by simply varying audio parameters such as the number of quantization bits and the sampling frequency, the robustness against disturbance can easily be set to whatever level is required.
[2] With digital audio ID information, the quality of the audio played back after extraction need only be the minimum at which the meaning can be understood. For example, with 4 quantization bits and a sampling frequency of 4 kHz, a watermark data rate as low as 16 kbit/s (uncompressed) is sufficient. A playback time of 1 to 3 seconds is enough for use as ID information.
[3] Because digital audio ID information is adopted as digital watermark information that is meaningful as ID information, it is enough that the meaning can be recognized even if the extracted audio ID watermark information contains noise. Even when the meaning is somewhat hard to recognize, removing noise by filtering readily makes it easier to recognize. The watermark extraction processing is therefore simple, and a low-cost information processing apparatus suffices for it.

[4] If automatic speech recognition is added, automatic recognition of the ID information also becomes possible.

[5] Unlike the embedding of image ID information, digital audio ID information need not be embedded as a positional pattern, so it is also easy to embed multiple instances in the original image information. Even where the rights in the digital image information are complex, the ID information of the copyright holder and the other right holders can easily be embedded in a separable manner.
[6] The embedding method using an audio watermark in a video device described in the above-mentioned Japanese Patent Laid-Open No. H10-290424, and the method of embedding visible audio in the image information processing technique described in Japanese Patent Laid-Open No. H11-168616, are inventions conceived from the viewpoint of multimedia use and are not aimed at copyright protection. They differ from the present invention, which concerns the embedding of audio ID information, in both their focus and their means of realization. The watermark embedding and separation methods described above therefore achieve effects that those known techniques neither address nor can attain.
The invention made by the present inventors has been described above in concrete terms on the basis of embodiments, but the present invention is not limited to them and can be modified in various ways without departing from its gist.
For example, the frequency transform is not limited to the one described above and may be a wavelet transform. Likewise, the basic algorithm for embedding the watermark information is not limited to the one described above; the original image information may be divided into many blocks and the watermark information embedded in those blocks.

Industrial Applicability

The present invention can be widely applied to various electronic approval, electronic commerce, and electronic distribution systems, and to the various information processing devices used in them, such as personal computers and portable communication terminals.

Claims

1. An information processing method characterized by separably embedding digital audio ID information, which has meaning as ID information, in digital original image information as invisible digital watermark information.
2. An information processing method characterized by comprising: a transform process of frequency-transforming digital original image information; an embedding process of embedding digital audio ID information, which has meaning as ID information, as invisible digital watermark information in predetermined frequency components obtained by the transform process; and an inverse transform process of applying the inverse of the frequency transform to the information that has undergone the embedding process.
3. An information processing method characterized by comprising: an embedding position determination process of determining, with reference to digital original image information, a plurality of embedding positions for embedding digital watermark information in the digital original image information; an order determination process of determining the order of embedding for the determined embedding positions; and a process of embedding digital audio ID information, which has meaning as ID information, as invisible digital watermark information at the embedding positions in accordance with the embedding order.
4. An information processing method characterized by comprising: a transform process of frequency-transforming digital original image information; an embedding position determination process of determining, from among the frequency components obtained by the transform process, a plurality of embedding frequency components as embedding positions for embedding digital watermark information; an order determination process of determining the order of embedding for the determined plurality of embedding positions; a process of embedding digital audio ID information, which has meaning as ID information, as invisible digital watermark information at the embedding positions in accordance with the embedding order; and an inverse transform process of applying the inverse of the frequency transform to the information that has undergone the embedding process.
5. The information processing method according to claim 3 or 4, further comprising a parameter calculation process of calculating, on the basis of the digital original image information, digital audio configuration parameters that define the number of quantization bits, the sampling frequency, and the audio duration of the digital audio ID information.
6. The information processing method according to claim 3 or 4, further comprising a parameter selection process of selecting, from a parameter table, digital audio configuration parameters that define the number of quantization bits, the sampling frequency, and the audio duration of the digital audio ID information.
7. The information processing method according to claim 3 or 4, wherein the digital audio ID information is audio information specified by audio parameters consisting of a number of quantization bits selected from 3 to 16 bits and a sampling frequency selected from the range of 3 to 40 kHz.
8. The information processing method according to claim 3 or 4, wherein the digital audio ID information is audio information formed by quantization with a 1-bit oversampling method and specified by audio parameters with a sampling frequency of 3 to 40 kHz.
9. The information processing method according to claim 3 or 4, wherein, when digital audio ID information has already been embedded as invisible digital watermark information, the embedding position determination process determines the embedding positions of the current audio ID information so that their centroid differs from the centroid of the plurality of embedding positions already used.
10. An information processing method characterized by comprising: a separation process of separating digital audio ID information from information in which the digital audio ID information, which has meaning as ID information, is separably embedded in digital original image information as invisible digital watermark information; and an audio reproduction process of reproducing the separated digital audio ID information.
11. An information processing method characterized by comprising: a frequency transform process of frequency-transforming information in which digital audio ID information, which has meaning as ID information, is embedded as invisible digital watermark information in predetermined frequency components of digital original image information; a separation process of separating the digital audio ID information from the information obtained by the frequency transform process; and an audio reproduction process of reproducing the separated digital audio ID information.
12. A recording medium characterized by having recorded on it, readably by an information processing device, a program for causing the information processing device to execute: a process of separably embedding digital audio ID information, which has meaning as ID information, in digital original image information as invisible digital watermark information; and a process of separating the digital audio ID information from the embedded information.
13. A recording medium characterized by having recorded on it, readably by an information processing device, a program for causing the information processing device to execute: a first transform process of frequency-transforming digital original image information; an embedding process of embedding digital audio ID information, which has meaning as ID information, as invisible digital watermark information in predetermined frequency components obtained by the first transform process; an inverse transform process of applying the inverse of the frequency transform to the information that has undergone the embedding process; a second transform process of transforming the inverse-transformed information into frequency components; and a separation process of separating the digital audio ID information from the information obtained by the second transform process.
14. A recording medium on which image information is recorded readably by an information processing device, wherein the image information is information in which digital audio ID information, which has meaning as ID information, is separably embedded in digital original image information as invisible digital watermark information.
15. A recording medium on which image information is recorded readably by an information processing device, wherein the image information is information in which digital audio ID information, which has meaning as ID information, is embedded as invisible digital watermark information in predetermined frequency components of digital original image information, and from which the digital audio ID information can be separated by way of a frequency transform.
16. A recording medium on which image information is recorded readably by an information processing device, wherein the image information is information in which digital audio ID information, which has meaning as an ID, is embedded as invisible digital watermark information in the sample-value domain of digital original image information or in the frequency domain obtained by frequency-transforming the digital original image information, and from which the embedded digital audio ID information can be separated from the sample-value domain or the frequency domain by an information processing device.
17. An information processing system characterized by comprising: a first information processing device that outputs image information in which digital audio ID information, which has meaning as an ID, is separably embedded as invisible digital watermark information in the sample-value domain of digital original image information or in the frequency domain obtained by frequency-transforming the digital original image information; and a second information processing device that receives the image information and uses the received image information as electronic approval information.
18. An information processing system characterized by comprising: a first information processing device that outputs image information in which digital audio ID information, which has meaning as an ID, is separably embedded as invisible digital watermark information in the sample-value domain of digital original image information or in the frequency domain obtained by frequency-transforming the digital original image information; and a second information processing device that receives the image information and uses the received image information as electronic signature information.
19. An information processing system characterized by comprising an information processing device that stores image information in which digital audio ID information, which has meaning as an ID, is separably embedded as invisible digital watermark information in the sample-value domain of digital original image information or in the frequency domain obtained by frequency-transforming the digital original image information, and that outputs the stored image information in response to a distribution request.
20. An information processing apparatus comprising input means and arithmetic control means, wherein the input means inputs digital original image information, and the arithmetic control means is capable of separably embedding digital audio ID information, which has meaning as ID information, in the input digital original image information as invisible digital watermark information.
21. An information processing apparatus comprising storage means and arithmetic control means, wherein the storage means stores digital original image information, and the arithmetic control means is capable of frequency-transforming the digital original image information stored in the storage means, embedding digital audio ID information, which has meaning as ID information, as invisible digital watermark information in predetermined frequency components obtained by the frequency transform, and applying the inverse of the frequency transform to the information that has undergone the embedding.
22. An information processing apparatus comprising storage means and arithmetic control means, wherein the storage means stores digital original image information, and the arithmetic control means is capable of determining, with reference to the digital original image information stored in the storage means, a plurality of embedding positions for embedding digital watermark information in that digital original image information, determining the order of embedding for the determined embedding positions, and embedding digital audio ID information, which has meaning as ID information, as invisible digital watermark information at the embedding positions in accordance with the embedding order.
23. An information processing apparatus comprising storage means and arithmetic control means, wherein the storage means stores digital original image information, and the arithmetic control means is capable of frequency-transforming the digital original image information stored in the storage means, determining, from among the frequency components obtained by the frequency transform, a plurality of embedding frequency components as embedding positions for embedding digital watermark information, determining the order of embedding for the determined plurality of embedding positions, embedding digital audio ID information, which has meaning as ID information, as invisible digital watermark information at the embedding positions in accordance with the embedding order, and applying the inverse of the frequency transform to the information that has undergone the embedding.
24. The information processing apparatus according to claim 23, wherein, when determining the embedding positions, if digital audio ID information has already been embedded as invisible digital watermark information, the arithmetic control means determines the embedding positions of the current audio ID information so that their centroid differs from the centroid of the plurality of embedding positions already used.
25. An information processing apparatus comprising input means, storage means, and arithmetic control means, wherein: the storage means stores embedding control information that predefines embedding positions, and the order of those positions, for embedding invisible digital watermark information in the sample-value domain of digital original image information or in the frequency domain obtained by frequency-transforming the digital original image information, and audio parameter information that predefines audio configuration parameters of digital audio ID information having meaning as digital watermark information; the input means inputs digital original image information; and the arithmetic control means selects the required audio parameters and embedding control information from the storage means, and embeds digital audio ID information conforming to the selected audio parameters in the sample-value domain of the digital original image information or in the frequency domain obtained by frequency-transforming it, at the positions and in the order specified by the selected embedding control information.
26. An information processing apparatus comprising input means, storage means, and arithmetic control means, wherein: the storage means stores embedding control information that predefines, in encrypted form, embedding positions, and the order of those positions, for embedding invisible digital watermark information in the sample-value domain of digital original image information or in the frequency domain obtained by frequency-transforming the digital original image information, and audio parameter information that predefines, in encrypted form, audio configuration parameters of digital audio ID information having meaning as digital watermark information; the input means inputs digital original image information; and the arithmetic control means selects the required encrypted audio parameters and encrypted embedding control information from the storage means, decrypts the selected encrypted audio parameters and encrypted embedding control information, and embeds digital audio ID information conforming to the decrypted audio parameter information in the sample-value domain of the digital original image information or in the frequency domain obtained by frequency-transforming it, at the positions and in the order specified by the decrypted embedding control information.
27. An information processing apparatus comprising input means and arithmetic control means, wherein the input means inputs information in which digital audio ID information, which has meaning as ID information, is separably embedded in digital original image information as invisible digital watermark information, and the arithmetic control means is capable of separating the digital audio ID information from the input information.
28. An information processing apparatus comprising input means and arithmetic control means, wherein the input means inputs information in which digital audio ID information, which has meaning as ID information, is embedded as invisible digital watermark information in predetermined frequency components of digital original image information, and the arithmetic control means is capable of frequency-transforming the input information and separating the digital audio ID information from the information obtained by the frequency transform.
29. The information processing apparatus according to claim 27 or 28, wherein the arithmetic control means is further capable of applying noise-removal filtering to the separated digital audio ID information.
30. The information processing apparatus according to claim 29, wherein the arithmetic control means is further capable of performing speech recognition on the digital audio ID information that has undergone the noise-removal filtering.
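The transform, embed, inverse-transform, and separate sequence recited in the claims can be sketched on a single row of eight pixel values. This is an illustration under stated assumptions, not the embedding algorithm of the application: the parity-of-quantized-coefficient rule, the step size `STEP`, the chosen `positions`, and all function names are hypothetical, and rounding the marked pixels back to integers is ignored.

```python
import math

STEP = 8.0  # hypothetical quantization step for the parity rule


def dct(block):
    """Orthonormal DCT-II of a 1-D list of numbers."""
    n = len(block)
    return [math.sqrt((1 if k == 0 else 2) / n)
            * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                  for i, x in enumerate(block))
            for k in range(n)]


def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    n = len(coeffs)
    return [sum(math.sqrt((1 if k == 0 else 2) / n) * c
                * math.cos(math.pi * (i + 0.5) * k / n)
                for k, c in enumerate(coeffs))
            for i in range(n)]


def embed(pixels, bits, positions):
    """Transform, write one bit per listed coefficient (in list order), invert."""
    coeffs = dct(pixels)
    for bit, pos in zip(bits, positions):
        q = round(coeffs[pos] / STEP)
        if q % 2 != bit:        # force the parity of the quantized value
            q += 1
        coeffs[pos] = q * STEP
    return idct(coeffs)


def extract(marked, positions):
    """Transform again and read back the parity at each embedding position."""
    coeffs = dct(marked)
    return [round(coeffs[pos] / STEP) % 2 for pos in positions]


pixels = [52, 55, 61, 66, 70, 61, 64, 73]   # one row of an image block
id_bits = [1, 0, 1, 1]                      # e.g. one 4-bit audio sample
positions = [2, 3, 4, 5]                    # embedding positions, in embedding order
marked = embed(pixels, id_bits, positions)
recovered = extract(marked, positions)
```

In this noise-free sketch `recovered` equals `id_bits`. A real system would also have to survive rounding to integer pixel values, compression, and the other processing the image undergoes, which is where the choice of positions and step size matters.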
PCT/JP1999/006838 1999-12-07 1999-12-07 Information processing method and recorded medium WO2001043422A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP1999/006838 WO2001043422A1 (en) 1999-12-07 1999-12-07 Information processing method and recorded medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP1999/006838 WO2001043422A1 (en) 1999-12-07 1999-12-07 Information processing method and recorded medium

Publications (1)

Publication Number Publication Date
WO2001043422A1 true WO2001043422A1 (en) 2001-06-14

Family

ID=14237489

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP1999/006838 WO2001043422A1 (en) 1999-12-07 1999-12-07 Information processing method and recorded medium

Country Status (1)

Country Link
WO (1) WO2001043422A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
SG112885A1 (en) * 2002-07-15 2005-07-28 Fuji Electric Co Ltd Magnetic disk medium and a fixed magnetic disk drive unit
JP2010147919A (en) * 2008-12-19 2010-07-01 Yamaha Corp Apparatus and program for embedding and extracting electronic watermark information
JPWO2014141413A1 (en) * 2013-03-13 2017-02-16 株式会社東芝 Information processing apparatus, output method, and program
US11501404B2 (en) * 2019-09-23 2022-11-15 Alibaba Group Holding Limited Method and system for data processing

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10224342A (en) * 1997-02-05 1998-08-21 Nippon Telegr & Teleph Corp <Ntt> Electronic watermark generating method and reading method therefor
JPH1132200A (en) * 1997-07-09 1999-02-02 Matsushita Electric Ind Co Ltd Watermark data insertion method and watermark data detection method
JPH1169137A (en) * 1997-08-20 1999-03-09 Canon Inc Electronic watermark system, electronic information distribution system and image file device
JPH1198479A (en) * 1997-09-17 1999-04-09 Pioneer Electron Corp Method for superimposing electronic watermark and its system
JPH11144380A (en) * 1997-11-07 1999-05-28 Nec Corp Method and device for preventing illicit copy

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP KR US

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)