US20040034602A1 - Method and apparatus for watermarking binary computer code
- Publication number
- US20040034602A1
- Authority
- US
- United States
- Prior art keywords
- computer
- data
- computer program
- representation
- encoded
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/16—Program or content traceability, e.g. by watermarking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/12—Protecting executable software
- G06F21/121—Restricting unauthorised execution of programs
- G06F21/125—Restricting unauthorised execution of programs by manipulating the program code, e.g. source code, compiled code, interpreted code, machine code
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/36—Preventing errors by testing or debugging software
- G06F11/3604—Software analysis for verifying properties of programs
Abstract
Description
- It can be useful to be able to identify the code produced by different compilers to, among other uses, identify non-licensed uses of the compilers and to track errors. Accordingly, compiler manufacturers require a method of including a serial number or other information in code produced by a compiler. Additionally, a method of analyzing a copy of the compiled code to determine the serial number or other information is also required.
- A private watermark, which is data hidden via steganography, is one method for embedding data in the outputs of licensed programs. However, traditional steganography requires the presence of “low order” bits in the data stream. The low order bits can be changed without the data changing so much that a human can notice the difference. The changed bits, detected when the modified data is compared to the original, can hold the steganographic data. Since traditional steganography changes non-significant low-order bits, it is normally applied to digital pictures and sounds, which contain such bits.
- Steganography in computer code can't be done with the normal methods because computer code does not contain low-order bits. Every bit in the code is important, and changing even one bit can prevent the code from operating correctly.
- Accordingly, improved techniques for inserting identifying watermarks in compiled programs are needed.
- In one embodiment of the invention, a method for generating and auditing a watermark for a compiled computer program is provided. The watermark is an integral part of the program and does not appear as an external data item.
- In another embodiment, a fixed location in the compiled code is specified and a legal fake instruction that does not affect the operation of the code is inserted. For each binary digit of the data to be embedded, one value of the digit is encoded as a first type of fake instruction and the other value of the binary digit is encoded as a second type of fake instruction.
- In another embodiment of the invention, the data itself is inserted into the compiled code at a location or locations determined by a mathematical function. A computer executing the compiled code also knows the function and determines the location(s) and removes the data prior to executing the compiled code. If a computer that does not know the function executes the program then it will crash because the inserted data are not legal instructions.
- In another embodiment of the invention, the data is encrypted prior to being inserted in the code.
- Other features and advantages of the invention will be apparent in view of the following detailed description and appended drawings.
- FIG. 1 is a block diagram of a computer system configured to implement an embodiment of the invention;
- FIG. 2 is a block diagram depicting the operation of an embodiment that encodes the watermark by inserting a fake instruction;
- FIG. 3 is a flowchart of the watermark encoding process of an embodiment of the invention;
- FIG. 4 is a flowchart of the watermark decoding process of an embodiment of the invention; and
- FIG. 5 is a block diagram depicting the operation of an embodiment that encodes the watermark by inserting data to be embedded.
- The invention will now be described, by way of example, not limitation, with reference to various embodiments. FIG. 1 is a block diagram of a computer system 10 configured to implement an embodiment of the invention. The computer system 10 includes a computer 12, an input device 14 such as a keyboard, and an output device 16 such as a display screen. The computer 12 includes a main memory 18, which may include RAM and NVRAM, and a central processing unit (“CPU”) 20. A compiler 24, a source code module 26, a compiled program 30, a watermark generating process S( ) 32, and a location generating process R( ) 34 are stored in the memory 18. As will be described below, a key, for use by the location generating process, may be stored in the main memory 18.
- A first embodiment of the invention will now be described. The compiler and computer processor agree on a function R( ), which is a location determining function 34 that determines one or more insertion points within a given compiled binary code. R( ) may be a constant function or may depend on the binary. In one embodiment, R( ) is a random number generator seeded by some part of the compiled code. Alternatively, R( ) may be a polynomial with inputs communicated by the compiled binary code.
- In an alternative embodiment, a value to be provided to R( ) can be processor specific and stored in the main memory 18 of the computer.
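The seeded random number generator variant of R( ) can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the function name `R`, its parameters, the use of SHA-256 to derive the seed, and the choice to count insertion points in instruction lines are all hypothetical. It also assumes the seed bytes come from a part of the binary that the insertions themselves do not alter, so that the decoder can recompute the same points.

```python
import hashlib
import random


def R(seed_bytes: bytes, n_points: int, n_lines: int) -> list:
    """Hypothetical location function: derive insertion points from a seed.

    seed_bytes: some part of the compiled code (or a processor-specific key)
                known to both the compiler and the decoder.
    n_points:   how many insertion points are needed (one per embedded bit).
    n_lines:    highest line index that may be used as an insertion point.
    Returns n_points distinct line indices in ascending order.
    """
    # Hash the seed material to a fixed-size integer so any byte string works.
    seed = int.from_bytes(hashlib.sha256(seed_bytes).digest(), "big")
    rng = random.Random(seed)  # private generator, independent of global state
    # Sample distinct positions in 1..n_lines, then sort so that insertions
    # can be applied front to back without disturbing earlier ones.
    return sorted(rng.sample(range(1, n_lines + 1), n_points))
```

Because the generator is seeded only from `seed_bytes`, two parties that share the same seed material obtain the same insertion points, while a party without it cannot easily predict them.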
- The operation of the first embodiment will now be described in more detail with reference to FIGS. 2-4. Referring to FIG. 2, a first block 40 depicts the unmodified program instructions, a second block 42 depicts the data to be added, in this example “1010000”, a third block 44 depicts the insertion point generated, in this example “4”, and a fourth block 46 depicts the resulting modified instructions. In this example it is assumed that the lines of program instructions are numbered sequentially.
- Referring now to the flowchart of FIG. 3, R( ) is called and generates a first insertion point as described above. The first bit of the data to be encoded, “1”, is then encoded as a fake move instruction. The compiler resources are utilized to identify unused registers which can be used as the arguments of the fake move instruction. Thus, in this case, the “movl %edx,%ebp” instruction does nothing because edx is not used in this function. The presence of the “movl” instruction in the compiled program (a change from the original program) encodes the first “1” bit of the data to be embedded.
- The process then loops to call R( ) again to generate a second insertion point offset from the first insertion point. The second bit, “0”, of the data is encoded. The encoding of the bits can be implemented in various ways.
- In the currently described embodiment, a first fake instruction, e.g., movl, is utilized to encode “1” and a second fake instruction is utilized to encode “0”.
- The bit “0” could be encoded as “0”, i.e., no instruction, or as another fake instruction that does nothing such as an “add” instruction that adds operands in unused registers. The fake instruction is then inserted at the incremented insertion point. The process continues to loop until all the data bits are encoded into the compiled program.
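The encoding loop described above can be sketched roughly as follows, using the FIG. 2 example data “1010000” and first insertion point 4. Several details are assumptions made for illustration: the program is modeled as a list of assembly-text lines, `insertion_points` is a hypothetical stand-in for R( ) that simply steps by a fixed offset, and the specific `addl` instruction chosen for “0” is invented here. A real compiler would also have to confirm that the registers involved are unused at each insertion point.

```python
# Hypothetical bit-to-instruction table. The "1" entry follows the example in
# the text; the "0" entry is an assumed do-nothing "add" on unused registers.
FAKE_FOR_BIT = {"1": "movl %edx,%ebp", "0": "addl %edx,%ebp"}


def insertion_points(first: int, count: int, step: int = 2) -> list:
    """Stand-in for R( ): 'count' ascending line indices starting at 'first'."""
    return [first + i * step for i in range(count)]


def embed(program: list, bits: str, points: list) -> list:
    """Return a copy of 'program' with one fake instruction inserted per bit.

    'points' are line indices in the *modified* program and must be strictly
    ascending, so each insertion leaves the earlier ones where they are.
    """
    if len(points) != len(bits):
        raise ValueError("need exactly one insertion point per bit")
    if any(b <= a for a, b in zip(points, points[1:])):
        raise ValueError("insertion points must be strictly ascending")
    out = list(program)
    for bit, point in zip(bits, points):
        if not 0 <= point <= len(out):
            raise ValueError("insertion point outside the program")
        out.insert(point, FAKE_FOR_BIT[bit])  # KeyError if bit is not 0/1
    return out


if __name__ == "__main__":
    original = ["instr_%d" % k for k in range(12)]
    marked = embed(original, "1010000", insertion_points(4, 7))
    print("\n".join(marked))
```

Treating the insertion points as positions in the modified program keeps the decoder simple: it can read the same indices back without tracking how earlier insertions shifted later lines.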
- The auditing and/or removal of the encoded data will now be described with reference to the flowchart of FIG. 4. R( ) is called to locate the insertion point. The fake instruction at the insertion point is decoded by S( ) to generate the first bit, “1”, of the digital encoded data. The deletion of the fake instruction is optional because the fake instruction does not affect the operation of the program. The program then loops to decode, and optionally delete, all the fake instructions encoding the watermark.
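The audit loop can be sketched in the same spirit. Again this is only an illustration: the program is modeled as a list of assembly-text lines, the insertion points are assumed to have been regenerated by R( ), and the instruction-to-bit table (playing the role of S( )) uses an assumed `addl` instruction for “0” that does not come from the source.

```python
# Hypothetical inverse table standing in for S( ): fake instruction -> bit.
BIT_FOR_FAKE = {"movl %edx,%ebp": "1", "addl %edx,%ebp": "0"}


def extract(program: list, points: list, strip: bool = False):
    """Decode the watermark bits found at 'points'.

    Returns (bits, program_out). When 'strip' is True the fake instructions
    are deleted from the returned copy; otherwise the copy is unchanged,
    which is acceptable because the fake instructions have no effect.
    """
    bits = []
    for point in points:
        instr = program[point]
        if instr not in BIT_FOR_FAKE:
            raise ValueError("no watermark instruction at line %d" % point)
        bits.append(BIT_FOR_FAKE[instr])
    restored = list(program)
    if strip:
        # Delete from the back so earlier indices stay valid.
        for point in sorted(points, reverse=True):
            del restored[point]
    return "".join(bits), restored
```

Raising an error when the expected instruction is missing gives the auditor a simple signal that the binary was not watermarked with these insertion points, or that the watermark was tampered with.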
- The second embodiment will now be described in more detail with reference to FIG. 5, where the unencoded watermark data is copied into the program code at the insertion point. The steps are the same as described above with reference to FIG. 3 except that the data is not encoded as fake instructions that have no effect.
- The removal of the watermark is the same as the steps described above with reference to FIG. 4 except that the data is not decoded and the removal is mandatory. The data is not a legal instruction and thus would cause the program to crash. An added benefit of this embodiment is that unauthorized users would not be able to use the compiled program.
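A minimal sketch of the second embodiment follows, assuming the compiled code is modeled as a byte string, the insertion offset comes from R( ), and the loader knows the length of the embedded data in advance. The function names and the sample payload are hypothetical.

```python
def insert_raw(code: bytes, data: bytes, offset: int) -> bytes:
    """Copy the raw watermark bytes into the code at 'offset'."""
    if not 0 <= offset <= len(code):
        raise ValueError("offset outside the code")
    return code[:offset] + data + code[offset:]


def remove_raw(marked: bytes, offset: int, length: int):
    """Pull 'length' watermark bytes back out before execution.

    Returns (data, executable_code). Unlike the fake-instruction scheme,
    this step is mandatory: the embedded bytes are not legal instructions.
    """
    if offset < 0 or length < 0 or offset + length > len(marked):
        raise ValueError("watermark span outside the code")
    data = marked[offset:offset + length]
    return data, marked[:offset] + marked[offset + length:]


if __name__ == "__main__":
    code = bytes(range(16))            # placeholder for compiled code
    marked = insert_raw(code, b"SN42", 4)
    data, runnable = remove_raw(marked, 4, 4)
    print(data, runnable == code)
```

A loader that applies the wrong offset strips the wrong bytes and is left with corrupted code, which mirrors the point that a processor without the function cannot run the program.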
- For either embodiment described above, the data inserted as watermarks can be public or private. Private data can be encrypted or made private in some other way.
- A lot of data can be stored in the watermark this way. In the first embodiment, it is difficult to find (and thus strip out) the watermark data. In the second embodiment, the program will not execute on a processor which does not know the function R( ), even if it supports the same instruction set.
- The invention has now been described with reference to the preferred embodiments. Alternatives and substitutions will now be apparent to persons of ordinary skill in the art. For example, the types of fake instruction which can encode the digital data to be encoded are not limited to the examples described. Additionally, groups of bits or characters could be encoded and inserted at a single insertion point. Accordingly, it is not intended to limit the invention except as provided by the appended claims.
Claims (9)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/223,256 US20040034602A1 (en) | 2002-08-16 | 2002-08-16 | Method and apparatus for watermarking binary computer code |
US11/938,080 US9607133B1 (en) | 2002-08-16 | 2007-11-09 | Method and apparatus for watermarking binary computer code |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/223,256 US20040034602A1 (en) | 2002-08-16 | 2002-08-16 | Method and apparatus for watermarking binary computer code |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/938,080 Division US9607133B1 (en) | 2002-08-16 | 2007-11-09 | Method and apparatus for watermarking binary computer code |
Publications (1)
Publication Number | Publication Date |
---|---|
US20040034602A1 true US20040034602A1 (en) | 2004-02-19 |
Family
ID=31715137
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US10/223,256 Abandoned US20040034602A1 (en) | 2002-08-16 | 2002-08-16 | Method and apparatus for watermarking binary computer code |
US11/938,080 Active 2028-06-15 US9607133B1 (en) | 2002-08-16 | 2007-11-09 | Method and apparatus for watermarking binary computer code |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/938,080 Active 2028-06-15 US9607133B1 (en) | 2002-08-16 | 2007-11-09 | Method and apparatus for watermarking binary computer code |
Country Status (1)
Country | Link |
---|---|
US (2) | US20040034602A1 (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2418498A (en) * | 2004-09-23 | 2006-03-29 | Farhan Khan | Protecting software code by creating a unique fingerprint |
US20090120694A1 (en) * | 2007-11-12 | 2009-05-14 | Suryaprakash Kompalli | Associating Auxilliary Data With Digital Ink |
US20090199305A1 (en) * | 2006-08-21 | 2009-08-06 | Koninklijke Philips Electronics N.V. | Controlling distribution of digital content |
CN102968596A (en) * | 2012-10-30 | 2013-03-13 | 南京信息工程大学 | Delete marker-based office open xml (OOX) document digital watermarking method |
WO2018204042A1 (en) * | 2017-05-05 | 2018-11-08 | Mastercard International Incorporated | System and method for data theft prevention |
WO2020000486A1 (en) * | 2018-06-30 | 2020-01-02 | 华为技术有限公司 | Data processing method and device |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5559884A (en) * | 1994-06-30 | 1996-09-24 | Microsoft Corporation | Method and system for generating and auditing a signature for a computer program |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7770016B2 (en) * | 1999-07-29 | 2010-08-03 | Intertrust Technologies Corporation | Systems and methods for watermarking software and other media |
US20070271191A1 (en) * | 2000-03-09 | 2007-11-22 | Andres Torrubia-Saez | Method and apparatus for secure distribution of software |
US7061510B2 (en) * | 2001-03-05 | 2006-06-13 | Digimarc Corporation | Geo-referencing of aerial imagery using embedded image identifiers and cross-referenced data sets |
US6934942B1 (en) * | 2001-08-24 | 2005-08-23 | Microsoft Corporation | System and method for using data address sequences of a program in a software development tool |
US7340778B2 (en) * | 2002-07-24 | 2008-03-04 | Macrovision Corporation | Method and apparatus for ensuring the copy protection of digital data |
-
2002
- 2002-08-16 US US10/223,256 patent/US20040034602A1/en not_active Abandoned
-
2007
- 2007-11-09 US US11/938,080 patent/US9607133B1/en active Active
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2418498A (en) * | 2004-09-23 | 2006-03-29 | Farhan Khan | Protecting software code by creating a unique fingerprint |
GB2418498B (en) * | 2004-09-23 | 2009-08-05 | Farhan Khan | Software mapping |
US20090199305A1 (en) * | 2006-08-21 | 2009-08-06 | Koninklijke Philips Electronics N.V. | Controlling distribution of digital content |
JP2010501923A (en) * | 2006-08-21 | 2010-01-21 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Digital content distribution control |
US9213808B2 (en) * | 2006-08-21 | 2015-12-15 | Irdeto B.V. | Controlling distribution of digital content |
US20090120694A1 (en) * | 2007-11-12 | 2009-05-14 | Suryaprakash Kompalli | Associating Auxilliary Data With Digital Ink |
US8681129B2 (en) * | 2007-11-12 | 2014-03-25 | Hewlett-Packard Development Company, L.P. | Associating auxiliary data with digital ink |
CN102968596A (en) * | 2012-10-30 | 2013-03-13 | 南京信息工程大学 | Delete marker-based office open xml (OOX) document digital watermarking method |
WO2018204042A1 (en) * | 2017-05-05 | 2018-11-08 | Mastercard International Incorporated | System and method for data theft prevention |
CN110574035A (en) * | 2017-05-05 | 2019-12-13 | 万事达卡国际公司 | system and method for data theft prevention |
WO2020000486A1 (en) * | 2018-06-30 | 2020-01-02 | 华为技术有限公司 | Data processing method and device |
CN110770725A (en) * | 2018-06-30 | 2020-02-07 | 华为技术有限公司 | Data processing method and device |
Also Published As
Publication number | Publication date |
---|---|
US9607133B1 (en) | 2017-03-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Stern et al. | Robust object watermarking: Application to code | |
US9009482B2 (en) | Forensic marking using a common customization function | |
KR101798672B1 (en) | Steganographic messaging system using code invariants | |
El-Khalil et al. | Hydan: Hiding information in program binaries | |
CN1897522B (en) | Water mark embedded and/or inspecting method, device and system | |
US8458476B2 (en) | Watermarking computer program code | |
Chroni et al. | Encoding watermark integers as self-inverting permutations | |
US9607133B1 (en) | Method and apparatus for watermarking binary computer code | |
Collberg et al. | More on graph theoretic software watermarks: Implementation, analysis, and attacks | |
Sion et al. | Resilient rights protection for sensor streams | |
Tayan et al. | A hybrid digital-signature and zero-watermarking approach for authentication and protection of sensitive electronic documents | |
Shahreza | An improved method for steganography on mobile phone. | |
US8141162B2 (en) | Method and system for hiding information in the instruction processing pipeline | |
US20070086060A1 (en) | Encoding apparatus, decoding apparatus, encoding method, computer product, and printed material | |
Collberg et al. | Graph theoretic software watermarks: Implementation, analysis, and attacks | |
Gong et al. | Detecting fingerprints of audio steganography software | |
CN116611032A (en) | Method, system and storage medium for embedding and extracting software watermark in JAR package | |
US7617396B2 (en) | Method and apparatus for watermarking binary computer code with modified compiler optimizations | |
Chroni et al. | Efficient encoding of watermark numbers as reducible permutation graphs | |
Mambo et al. | Fingerprints for copyright software protection | |
Chionis et al. | A dynamic watermarking model for embedding reducible permutation graphs into software | |
Nagra et al. | Software watermarking: Protective terminology | |
JP2002258961A (en) | Method, apparatus and program for embedding sub- information in computer program, storage medium having the same program stored thereon, and method, apparatus and program for reading sub-information from computer program, and storage medium having the same program stored thereon | |
JP2002158859A (en) | Method and system for embedding electronic watermark, recording medium storing program for embedding electronic watermark and medium for recording contents data | |
JP2002300374A (en) | Program to execute electronic watermark information processing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: QUICKSILVER TECHNOLOGY, INC., CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: RUBIN, OWEN ROBERT; MURRAY, ERIC; REEL/FRAME: 013209/0605; SIGNING DATES FROM 20020725 TO 20020807 |
 | AS | Assignment | Owner name: TECHFARM VENTURES MANAGEMENT, LLC, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: QUICKSILVER TECHNOLOGY, INC.; REEL/FRAME: 018194/0515. Effective date: 20051013 |
 | AS | Assignment | Owner name: QST HOLDINGS, LLC, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: TECHFARM VENTURES MANAGEMENT, LLC; REEL/FRAME: 018224/0634. Effective date: 20060831 |
 | AS | Assignment | Owner name: NVIDIA CORPORATION, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: QST HOLDINGS, L.L.C.; REEL/FRAME: 018711/0567. Effective date: 20060219 |
 | AS | Assignment | Owner name: NVIDIA CORPORATION, CALIFORNIA. Free format text: CORRECTIVE ASSIGNMENT ON REEL 018711, FRAME 0567; ASSIGNOR: QST HOLDINGS, LLC; REEL/FRAME: 018923/0630. Effective date: 20060919 |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |