CN105404614A - Subject and predicate coding based text watermark embedding and extraction method
- Publication number: CN105404614A
- Application number: CN201510743382.1A
- Authority: CN (China)
- Prior art keywords: subject, text, predicate, unicode, language
- Legal status: Granted (the listed status is an assumption and is not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/10—Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
- G06F21/16—Program or content traceability, e.g. by watermarking
Abstract
The invention relates to a text watermark embedding and extraction method based on subject and predicate coding. The embedding method comprises: 1) representing each character of the watermark information by its Unicode code to form a Unicode code string; 2) detecting the subjects and predicates of the sentences in the text to be watermarked, and storing them in a set; 3) dividing the Unicode code string into segments according to the number of subject-predicate pairs, assigning one segment to each pair, and giving each segment a number; and 4) storing each subject-predicate pair, its Unicode code segment, and the segment number in sequence to form a codebook, which completes the coding and embeds the watermark. The extraction method comprises: finding the subjects and predicates in the text under examination; looking them up in the codebook to retrieve the corresponding Unicode code segments and segment numbers; splicing the segments in number order; and converting the resulting Unicode code string back into characters to recover the watermark information. The method changes neither the format nor the content of the text, offers good concealment and robustness, and uses a simple algorithm that is easy to implement.
Description
Technical field
The present invention relates to watermark embedding and extraction techniques, and in particular to a text watermark embedding and extraction method based on subject and predicate coding.
Background technology
With the spread of the Internet and information technology, more and more text is published, distributed, and used in digital form. While this brings convenience to study, work, and daily life, it also makes text easy to copy illegally and misappropriate, so the intellectual property protection of digital text has drawn wide attention in industry. Text watermarking is a technique that has emerged in recent years for protecting the intellectual property of digital text. It embeds copyright or authentication information (the watermark) in a digital text in some way; when the text is found to have been illegally copied or misappropriated, this information can be extracted to prove copyright ownership, confirm the illegal copying or misappropriation, and protect the rights of the copyright owner or holder. Text watermarking can also be used to hide and transmit secret information in text, to authenticate text content, and to track text.
Text watermarking currently falls into two classes of methods: watermarking based on text format and watermarking based on natural language. Format-based watermarking embeds the watermark through slight format changes that are hard to notice, such as changes to line spacing, word spacing, or character size. Such methods are simple in structure and easy to implement, but converting the format of the text may destroy the embedded watermark, so their robustness is weak. Natural-language-based watermarking encodes the watermark using the syntax and semantics of the text content; most current implementations encode the watermark through synonym substitution or syntactic transformation. Compared with format-based watermarks, natural language watermarks are better concealed and more robust, and format conversion does not affect them. However, because of the complexity of the Chinese language, synonym substitution and syntactic transformation may introduce ambiguity or change the meaning, and they are unsuitable where the text content must not be altered.
Summary of the invention
The object of the invention is to overcome the above deficiencies of the prior art and to provide a text watermark embedding and extraction method based on subject and predicate coding that has good concealment and robustness. This is achieved by the following technical scheme:
The text watermark embedding method based on subject and predicate coding comprises:
1) Representing each character of the watermark information by its Unicode code to form a Unicode code string.
2) Detecting the subject and predicate of each sentence in the text to be watermarked and storing them in a set.
3) Dividing the Unicode code string into several segments according to the number of subject-predicate pairs detected, and assigning one segment to each pair. Because reordering the sentences of the text could otherwise prevent the watermark from being extracted correctly, each pair's Unicode code segment is given a number, and the Unicode code string is spliced according to these numbers when the watermark is extracted.
4) Storing, in sequence, each subject-predicate pair, its Unicode code segment, and the segment number to form a codebook, which completes the coding and embeds the watermark.
The Unicode coding uses the UTF-16 format: each character is represented by 4 hexadecimal digits, forming a hexadecimal Unicode code string.
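As an illustration of the hexadecimal representation just described, the following C# sketch encodes a watermark string one UTF-16 code unit at a time. The names WatermarkHex and Encode are hypothetical and not part of the original method. Note that a character outside the Basic Multilingual Plane occupies two UTF-16 code units and would therefore yield 8 hexadecimal digits rather than 4.

```csharp
using System;
using System.Linq;

public static class WatermarkHex
{
    // Encode each UTF-16 code unit of the watermark as 4 lowercase hex digits.
    public static string Encode(string info)
    {
        return string.Concat(info.Select(c => ((int)c).ToString("x4")));
    }

    public static void Main()
    {
        Console.WriteLine(Encode("AB"));   // 00410042
        Console.WriteLine(Encode("水印")); // 6c345370
    }
}
```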
Detecting the subjects and predicates in the text to be watermarked in step 2) comprises the following steps:
A) converting the text to be watermarked into a character string;
B) submitting that string to the Language Technology Platform (LTP) for dependency parsing, which returns an XML-format string containing the dependency relations among the sentence constituents of the text;
C) converting the XML-format string into an XML document, parsing it with DOM, and traversing the document in a loop, using the link between the head (core) relation and the subject-verb relation in the relation attribute of the sentence constituents to find the subject and predicate of each sentence.
In a further design of the embedding method, the subject-predicate pair, the Unicode code segment, and the number in each line of the codebook are separated by spaces.
Building on the embedding method above, an extraction method for text watermarks based on subject and predicate coding is proposed. It comprises: finding the subjects and predicates in the text under examination; comparing them against the codebook formed when the watermark was embedded; taking from the codebook the Unicode code segment and number corresponding to each subject-predicate pair; splicing the Unicode code segments in the order of their numbers to obtain the Unicode code string representing the watermark information; and converting that string back into characters to recover the embedded watermark information.
In a further design of the extraction method, the step of retrieving the Unicode code segment and number for each subject-predicate pair in the text under examination comprises: comparing each subject-predicate pair found in the text one by one with each pair in the codebook and, where the two match, taking the corresponding Unicode code segment and number from the codebook.
The advantages of the invention are as follows:
The invention proposes a new text watermark embedding and extraction method that embeds the watermark by encoding the watermark information with the subjects and predicates of the sentences in the text. The method makes no change to the format or content of the text and has no effect on the original; the embedding leaves no trace in the text and cannot be noticed, giving good concealment. Converting the format of the text (including changing line spacing, word spacing, character size, font, or color), rearranging paragraphs, or reordering sentences does not affect correct extraction of the watermark, giving good robustness. The algorithm is also simple in structure and easy to implement.
Embodiment
The solution of the invention is described in detail below.
The text watermark embedding method based on subject and predicate coding provided by this embodiment comprises the following steps:
1) Each character of the watermark information is represented by its Unicode code in UTF-16 format, 4 hexadecimal digits per character, forming a hexadecimal Unicode code string.
2) The subjects and predicates in the text to be watermarked are detected and stored in a set.
3) The Unicode code string is divided into several segments according to the number of subject-predicate pairs detected, and one segment is assigned to each pair. Each pair's Unicode code segment is given a number, and the Unicode code string is spliced according to these numbers when the watermark is extracted.
4) Each subject-predicate pair, its Unicode code segment, and the segment number are stored in sequence to form a codebook, which completes the coding and embeds the watermark. In the codebook, the subject-predicate pair, the Unicode code segment, and the number in each line are separated by spaces.
Further, detecting the subjects and predicates in the text to be watermarked in step 2) comprises the following steps:
A) The text to be watermarked is converted into a character string.
B) The string is submitted to the Language Technology Platform (LTP) for dependency parsing, which returns an XML-format string containing the dependency relations among the sentence constituents of the text.
C) The XML-format string is converted into an XML document and parsed with DOM. The document is traversed in a loop, using the link between the head (core) relation and the subject-verb relation in the relation attribute of the sentence constituents to find the subject and predicate of each sentence.
The Language Technology Platform (LTP) mentioned above is a complete, open, online Chinese natural language processing system developed over ten years by the Research Center for Social Computing and Information Retrieval at Harbin Institute of Technology. It offers six processing functions in three areas: lexical analysis (word segmentation, part-of-speech tagging, and named entity recognition), syntactic analysis (dependency parsing), and semantic analysis (word sense disambiguation and semantic role labeling). The platform is open to the public and easy to use. It provides an application programming interface (API): the user sets the API parameters according to the application's needs, constructs an HTTP request, submits the text content to the system, and obtains the analysis result online. The method described here mainly uses LTP's dependency parsing function: the text to be analyzed is submitted to the platform, which returns the dependency relations among the constituents of each sentence. These relations are then processed further to obtain the subject and predicate of each sentence. The basic process is as follows:
The text to be analyzed is converted into a character string, the API parameters and calling method are set, and the string is submitted to LTP for dependency parsing. The LTP result is an XML-format document containing the dependency relations among the sentence constituents. The document contains para (paragraph), sent (sentence), and word (segmented word) nodes, among others. Each word node has the following attributes: id, the index of the word in its sentence; cont, the word itself; parent, the id of the word's parent node in the dependency parse; and relate, the dependency relation to that parent. To find all subjects and predicates in the text, the method loops over every paragraph, every sentence, and every word. When the traversal reaches a word node with relate="HED" (HED denotes the head, or core, relation), the cont value of that node is the predicate of the sentence. The sentence is then searched for a word node whose relate value is SBV (SBV denotes the subject-verb relation) and whose parent value equals the id of the predicate node. If such a node exists, its cont value is the subject of the sentence. The subject and predicate are extracted and stored in a set.
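The traversal described above can be sketched as follows. The XML in Main is a hand-written sample that merely follows the para/sent/word layout and the attribute names given above; it is not real LTP output, and SubjectPredicateFinder and Find are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class SubjectPredicateFinder
{
    // Return subject+predicate strings, one per (SBV word, HED word) pair in each sentence.
    public static List<string> Find(string xml)
    {
        var pairs = new List<string>();
        foreach (XElement sent in XDocument.Parse(xml).Descendants("sent"))
        {
            List<XElement> words = sent.Elements("word").ToList();
            // The HED word is the predicate of the sentence.
            foreach (XElement head in words.Where(w => (string)w.Attribute("relate") == "HED"))
            {
                string headId = (string)head.Attribute("id");
                // The subject is an SBV word whose parent is the predicate.
                foreach (XElement subj in words.Where(w =>
                    (string)w.Attribute("relate") == "SBV" &&
                    (string)w.Attribute("parent") == headId))
                {
                    pairs.Add((string)subj.Attribute("cont") + (string)head.Attribute("cont"));
                }
            }
        }
        return pairs;
    }

    public static void Main()
    {
        // Hypothetical sample document, written by hand for this sketch.
        string xml =
            "<doc><para id=\"0\"><sent id=\"0\">" +
            "<word id=\"0\" cont=\"我们\" parent=\"1\" relate=\"SBV\"/>" +
            "<word id=\"1\" cont=\"学习\" parent=\"-1\" relate=\"HED\"/>" +
            "<word id=\"2\" cont=\"语法\" parent=\"1\" relate=\"VOB\"/>" +
            "</sent></para></doc>";
        foreach (string pair in Find(xml)) Console.WriteLine(pair);
    }
}
```

For the sample sentence, the only pair found is the SBV word joined to the HED word; the VOB word is ignored because its relation is not SBV.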
The specific code, with comments, that implements the watermark embedding process is given below.
As the LTP API requires, the text to be analyzed is first converted into a character string. This can be done with the System.IO.File.ReadAllText(path, Encoding.Default) function of C#; the corresponding code is:
string text = System.IO.File.ReadAllText(path, Encoding.Default);
Here path is the path of the text file to be analyzed, and text is the string containing its content.
The API parameters are then set: the address of the LTP web service (urlbase), the key needed to use the API (api_key, obtained when the user registers), the analysis mode (pattern, set to dp for dependency parsing), the result format (format, set to XML), and the HTTP request method (GET). The string containing the text content (text) is submitted to the LTP platform for dependency parsing. The core code for this process is as follows:
string urlbase = "http://api.ltp-cloud.com/analysis/";
string api_key = "k2r3q7tqGgWp5zBZRSnEHvNKfTRSFhjMtnHQ0QeP";
string pattern = "dp";
string format = "xml";
string strParam = ("api_key=" + api_key + "&text=" + text.ToString()
    + "&pattern=" + pattern + "&format=" + format);
Encoding encoding = Encoding.GetEncoding("utf-8");
HttpWebRequest req =
    WebRequest.Create(urlbase + strParam) as HttpWebRequest;
req.Method = "GET";
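The request listing above places the text in the query string unescaped. A hedged variant is sketched below: it percent-encodes each parameter value with Uri.EscapeDataString so that spaces, ampersands, or non-ASCII characters in the text cannot break the query. The helper name LtpRequestUrl.Build and the placeholder key are hypothetical, and joining the parameters to the base address with "?" is an assumption about the service, not something the original listing shows.

```csharp
using System;

public static class LtpRequestUrl
{
    // Build the GET URL with every parameter value percent-encoded.
    public static string Build(string urlBase, string apiKey, string text,
                               string pattern, string format)
    {
        return urlBase
            + "?api_key=" + Uri.EscapeDataString(apiKey)
            + "&text=" + Uri.EscapeDataString(text)
            + "&pattern=" + Uri.EscapeDataString(pattern)
            + "&format=" + Uri.EscapeDataString(format);
    }

    public static void Main()
    {
        // YOUR_API_KEY is a placeholder, not a real key.
        Console.WriteLine(Build("http://api.ltp-cloud.com/analysis/",
                                "YOUR_API_KEY", "我们 学习 & 研究", "dp", "xml"));
    }
}
```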
The result returned by LTP is then read, giving an XML-format string that contains the dependency relations among the sentence constituents of the text. This can be done with the C# StreamReader class; the corresponding code is:
HttpWebResponse webResponse = req.GetResponse() as HttpWebResponse;
StreamReader streamReader =
    new StreamReader(webResponse.GetResponseStream(), encoding);
string result = streamReader.ReadToEnd();
result now holds the analysis result.
The XML-format string is converted into an XML document and parsed with DOM. Using the link between the HED (head) and SBV (subject-verb) values of the relate attribute in the dependency parsing result, the code loops over every paragraph, every sentence, and every word to find the subject and predicate of each sentence. The core code for this process is as follows:
XmlDocument doc = new XmlDocument();         // convert the string to an XML document
doc.LoadXml(result);
XmlElement root = doc.DocumentElement;
// hs is assumed to be a HashSet<string>; its declaration is not shown in the original listing
HashSet<string> hs = new HashSet<string>();
XmlNodeList list1, list2, list3;
XmlNode list4;
list1 = root.SelectNodes("//para");
foreach (XmlNode node1 in list1) {           // loop over para nodes
    list2 = node1.ChildNodes;
    foreach (XmlNode node2 in list2) {       // loop over sent nodes
        list3 = node2.ChildNodes;
        foreach (XmlNode node3 in list3) {   // loop over word nodes
            if (node3.Attributes["relate"].InnerText != "HED")   // not the predicate
                continue;
            list4 = node3;                   // the predicate node
            foreach (XmlNode node4 in list3) {
                // the subject is the SBV word whose parent is the predicate
                if (node4.Attributes["parent"].InnerText == list4.Attributes["id"].InnerText
                        && node4.Attributes["relate"].InnerText == "SBV")
                    hs.Add(node4.Attributes["cont"].InnerText + list4.Attributes["cont"].InnerText);
            }
        }
    }
}
List<string> sbv = new List<string>();
sbv.AddRange(hs);
The set sbv now holds the subject-predicate pairs found in the text.
The watermark information to be embedded is encoded in UTF-16 and converted into a Unicode code string. The corresponding code is:
string uc = "";
byte[] bts = Encoding.Unicode.GetBytes(info);   // UTF-16 little-endian bytes
for (int i = 0; i < bts.Length; i += 2)
    // high byte first, then low byte, each as 2 hexadecimal digits
    uc += bts[i + 1].ToString("x").PadLeft(2, '0') + bts[i].ToString("x").PadLeft(2, '0');
Here info holds the watermark information and uc holds the generated Unicode code string.
The Unicode code string is then encoded with the subject-predicate pairs in the set. Each pair is taken from the set in turn, allocated a segment of the Unicode code, and given a number; the pair, its Unicode code segment, and the number are separated by spaces to form the codebook. The core code for this process is as follows.
Code that determines how many digits of the Unicode code each subject-predicate pair receives:
int strU_size = uc.Length;                // number of digits in the Unicode code string
int sbv_size = sbv.Count;                 // number of subject-predicate pairs
int count_size = strU_size / sbv_size;    // number of digits allocated to each pair
Code that allocates a Unicode code segment to each subject-predicate pair:
for (int x = 0; x < sbv_size; x++) {
    if (x == sbv_size - 1)    // last pair: takes all remaining digits, so its length may differ
        code_list.Add(sbv[x] + " " + uc.ToString().Substring(x * count_size) + " " + (x + 1));
    else                      // earlier pairs: equal shares of count_size digits
        code_list.Add(sbv[x] + " " + uc.ToString().Substring(x * count_size, count_size) + " " + (x + 1));
}
The set code_list holds the codebook content; writing it to a txt file yields the codebook file of the embedded watermark.
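To make the codebook layout concrete, the sketch below applies the same allocation rule to a small example. The pair strings S1P1, S2P2, and S3P3 are placeholders standing in for real subject-predicate pairs, CodebookBuilder is a hypothetical name, and the check for an empty pair list is an added safeguard (the division would otherwise fail when no pair is detected).

```csharp
using System;
using System.Collections.Generic;

public static class CodebookBuilder
{
    // Split the hex string evenly over the pairs; the last pair takes the remainder.
    public static List<string> Build(IList<string> pairs, string hex)
    {
        if (pairs.Count == 0)
            throw new ArgumentException("no subject-predicate pairs found");
        var lines = new List<string>();
        int size = hex.Length / pairs.Count;
        for (int x = 0; x < pairs.Count; x++)
        {
            string segment = (x == pairs.Count - 1)
                ? hex.Substring(x * size)         // remainder goes to the last pair
                : hex.Substring(x * size, size);  // equal share for earlier pairs
            lines.Add(pairs[x] + " " + segment + " " + (x + 1));
        }
        return lines;
    }

    public static void Main()
    {
        // "00410042" is the hex form of the watermark "AB".
        foreach (string line in Build(new[] { "S1P1", "S2P2", "S3P3" }, "00410042"))
            Console.WriteLine(line);
    }
}
```

With 8 digits and three pairs, each of the first two pairs receives 8 / 3 = 2 digits and the last receives the remaining 4, so the codebook lines are "S1P1 00 1", "S2P2 41 2", and "S3P3 0042 3".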
Building on the embedding method above, an extraction method for text watermarks based on subject and predicate coding is proposed. Its embodiment is as follows:
When a watermark needs to be extracted, the text under examination is submitted to the LTP platform for dependency parsing, and the analysis result is processed further to obtain the subjects and predicates in the text, which are stored in a set. The code for this process is the same as that used above when embedding the watermark.
The codebook file formed when the watermark was embedded is opened, and each subject-predicate pair in the set is decoded against the codebook. That is, each pair in the set is compared one by one with each pair in the codebook; where the two match, the corresponding Unicode code segment and its number are taken out. The segments obtained are spliced in number order, giving the Unicode code string that represents the watermark information.
The code, with comments, for the main operations of this process is given below.
Each line of the codebook is read into an array. The code is:
string[] lines = File.ReadAllLines(path);
Here path is the path of the codebook file, and lines is the array holding each line of the codebook.
Each line is split at the first space to separate out its subject-predicate pair, which is compared one by one with the pairs from the text under examination stored in the set above. Where a match is found, the Unicode code segment and number of that line are taken and put into a set. The code is:
for (int i = 0; i < sbv.Count; i++) {
    for (int j = 0; j < lines.Length; j++) {
        // lgs[0] is the pair; lgs[1] is "segment number"
        string[] lgs = lines[j].ToString().Split(new Char[] { ' ' }, 2);
        if (sbv[i] == lgs[0])
            st.Add(lgs[1]);
    }
}
st is the set holding the Unicode code segments and their numbers.
Each Unicode code segment is separated from its number at the space, and the segments are spliced in number order to obtain the Unicode code string representing the watermark information. The code is:
for (int x = 0; x < st.Count; x++) {
    for (int y = 0; y < st.Count; y++) {
        // lgs[0] is the segment; lgs[1] is its number
        string[] lgs = st[y].ToString().Split(new Char[] { ' ' }, 2);
        if (Convert.ToInt32(lgs[1]) == (x + 1))
            drawUc.Append(lgs[0]);
    }
}
drawUc holds the Unicode code string representing the watermark information.
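The lookup and splicing steps above can also be expressed more compactly. The following sketch, with the hypothetical names WatermarkExtractor and ExtractHex, filters the codebook lines by the detected pairs and sorts the matches by number instead of scanning for each number in turn. It is an alternative formulation under the same codebook layout, not the listing used in the embodiment. The sample data uses placeholder pairs and shows that a changed sentence order does not alter the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WatermarkExtractor
{
    // Look up each detected pair in the codebook and splice the
    // matching segments in the order of their numbers.
    public static string ExtractHex(IEnumerable<string> detectedPairs,
                                    IEnumerable<string> codebookLines)
    {
        var detected = new HashSet<string>(detectedPairs);
        return string.Concat(
            codebookLines
                .Select(line => line.Split(' '))           // [pair, segment, number]
                .Where(f => f.Length == 3 && detected.Contains(f[0]))
                .OrderBy(f => int.Parse(f[2]))
                .Select(f => f[1]));
    }

    public static void Main()
    {
        string[] codebook = { "S1P1 00 1", "S2P2 41 2", "S3P3 0042 3" };
        // Pairs as found in a text whose sentences were reordered.
        string[] detected = { "S3P3", "S1P1", "S2P2" };
        Console.WriteLine(ExtractHex(detected, codebook)); // 00410042
    }
}
```

If a pair from the codebook is missing from the text under examination, its segment is simply absent from the result, so the recovered code string is shorter than the one embedded.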
Following the UTF-16 coding rule used when the watermark was embedded, the Unicode code string is converted back into characters, which yields the embedded watermark information. The core code for this process is as follows:
// str is the Unicode code string to decode (for example, drawUc.ToString())
MatchCollection mc = Regex.Matches(str, @"([\w]{2})([\w]{2})",
    RegexOptions.Compiled | RegexOptions.IgnoreCase);
byte[] bts = new byte[2];
foreach (Match m in mc) {
    bts[0] = (byte)int.Parse(m.Groups[2].Value, NumberStyles.HexNumber);   // low byte
    bts[1] = (byte)int.Parse(m.Groups[1].Value, NumberStyles.HexNumber);   // high byte
    toStr += Encoding.Unicode.GetString(bts);
}
toStr holds the extracted watermark information.
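For comparison, the conversion can also be written without a regular expression. The sketch below (WatermarkText and Decode are hypothetical names) reads the code string 4 hexadecimal digits at a time and rejects input whose length is not a multiple of 4, which would indicate a missing or truncated segment.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class WatermarkText
{
    // Turn every group of 4 hex digits back into one UTF-16 code unit.
    public static string Decode(string hex)
    {
        if (hex.Length % 4 != 0)
            throw new FormatException("hex length must be a multiple of 4");
        var sb = new StringBuilder(hex.Length / 4);
        for (int i = 0; i < hex.Length; i += 4)
        {
            ushort unit = ushort.Parse(hex.Substring(i, 4),
                NumberStyles.HexNumber, CultureInfo.InvariantCulture);
            sb.Append((char)unit);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Decode("00410042")); // AB
    }
}
```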
Claims (6)
1. A text watermark embedding method based on subject and predicate coding, characterized by comprising:
1) representing each character of the watermark information by its Unicode code to form a Unicode code string;
2) detecting the subject and predicate of each sentence in the text to be watermarked and storing them in a set;
3) dividing the Unicode code string into several segments according to the number of subject-predicate pairs detected, assigning one segment to each pair, and giving each pair's Unicode code segment a number, the Unicode code string being spliced according to these numbers when the watermark is extracted;
4) storing, in sequence, each subject-predicate pair, its Unicode code segment, and the segment number to form a codebook, thereby completing the coding and embedding the watermark.
2. The text watermark embedding method based on subject and predicate coding according to claim 1, characterized in that the Unicode coding uses the UTF-16 format, each character being represented by 4 hexadecimal digits, forming a hexadecimal Unicode code string.
3. The text watermark embedding method according to claim 1, characterized in that detecting the subjects and predicates in the text to be watermarked in step 2) comprises the following steps:
A) converting the text to be watermarked into a character string;
B) submitting the string to the Language Technology Platform (LTP) for dependency parsing, which returns an XML-format string containing the dependency relations among the sentence constituents of the text;
C) converting the XML-format string into an XML document, parsing it with DOM, and traversing the document in a loop, using the link between the head (core) relation and the subject-verb relation in the relation attribute of the sentence constituents to find the subject and predicate of each sentence.
4. The text watermark embedding method based on subject and predicate coding according to claim 1, characterized in that the subject-predicate pair, the Unicode code segment, and the number in each line of the codebook are separated by spaces.
5. An extraction method for text watermarks based on subject and predicate coding, building on the text watermark embedding method according to any one of claims 1-4, characterized by comprising: finding the subjects and predicates in the text under examination; comparing them against the codebook formed when the watermark was embedded; taking from the codebook the Unicode code segment and number corresponding to each subject-predicate pair; splicing the Unicode code segments in the order of their numbers to obtain the Unicode code string representing the watermark information; and converting that string back into characters to recover the embedded watermark information.
6. The extraction method for text watermarks based on subject and predicate coding according to claim 5, characterized in that the step of retrieving the Unicode code segment and number for each subject-predicate pair in the text under examination comprises: comparing each subject-predicate pair found in the text one by one with each pair in the codebook and, where the two match, taking the corresponding Unicode code segment and number from the codebook.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510743382.1A (CN105404614B) | 2015-11-05 | 2015-11-05 | Text watermark embedding and extraction method based on subject and predicate coding |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105404614A true CN105404614A (en) | 2016-03-16 |
CN105404614B CN105404614B (en) | 2018-05-25 |
Family
ID=55470109
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510743382.1A (CN105404614B, active) | Text watermark embedding and extraction method based on subject and predicate coding | 2015-11-05 | 2015-11-05 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105404614B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN105404614B (en) | 2018-05-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||