CN106484781A - Indonesian-Chinese cross-language retrieval method and system fusing association patterns and user feedback - Google Patents
- Publication number
- CN106484781A (CN201610827858.4A)
- Authority
- CN
- China
- Prior art keywords
- language
- chinese
- tli
- user
- document
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3332—Query translation
- G06F16/3337—Translation of the query language, e.g. Chinese to English
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3325—Reformulation based on results of preceding query
- G06F16/3326—Reformulation based on results of preceding query using relevance feedback from the user, e.g. relevance feedback on documents, documents sets, document terms or passages
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses an Indonesian-Chinese cross-language retrieval method and system that fuse association patterns with user feedback. A machine translation module translates the Indonesian user query into a Chinese query and submits it to a search engine module, which returns an initial retrieval result document set. A user click-behavior relevance feedback extraction module derives the user-feedback relevant document set from that result set, and a document preprocessing module turns it into an initial relevant document database. An all-weighted association rule mining module then builds an all-weighted association rule base, from which a cross-language query expansion word generation module builds an expansion word library. A cross-language query expansion module combines the expansion words with the original query and resubmits the new query to the search engine module to obtain the final Chinese result documents. Finally, a final result display module submits the final results to the machine translation module, which translates them into Indonesian documents that are returned to the user. The invention improves cross-language retrieval performance and has good practical value and prospects for wider adoption.
Description
Technical field
The invention belongs to the field of document information retrieval. Specifically, it is an Indonesian-Chinese cross-language retrieval method and system fusing association patterns and user feedback, suitable for fields such as cross-language text information retrieval in which Chinese documents are retrieved with Indonesian queries.
Background technology
Cross-language information retrieval is a technology in which a query in one language retrieves information resources in another language. Indonesian-Chinese cross-language information retrieval is the problem of retrieving Chinese documents with Indonesian queries. The language in which the query is expressed, Indonesian, is called the source language, and the language of the retrieved documents, Chinese, is called the target language. As exchanges between China and the ASEAN countries grow ever closer, research on cross-language information retrieval for ASEAN languages has become urgent and important.
Researchers worldwide have studied cross-language information retrieval methods and systems in depth from many angles and directions and have achieved a great deal. Nevertheless, the problems of current cross-language information retrieval research are not fully solved. One of the field's most pressing and most closely watched problems is the severe query topic drift that arises during cross-language retrieval, together with a word-mismatch problem that is worse than in monolingual retrieval. These problems often leave cross-language retrieval performing below monolingual retrieval. In response, cross-language information retrieval based on query expansion has attracted growing attention in recent years. This research concentrates mainly on three approaches:
relevance feedback (Parton K, Gao J. Combining Signals for Cross-Lingual Relevance Feedback [C]. Proceedings of the 8th Asia Information Retrieval Societies Conference (AIRS 2012), Tianjin, China. Springer-Verlag Berlin Heidelberg, 2012, LNCS 7675, Information Retrieval Technology. 2012: 356-365; Lee C J, Croft W B. Cross-Language Pseudo-Relevance Feedback Techniques for Informal Text [C]. Proceedings of the 36th European Conference on IR Research (ECIR 2014), Amsterdam, The Netherlands. Advances in Information Retrieval. Springer International Publishing, 2014: 260-272);
latent semantics (Bi Jianting, Su Yidan. Cross-language query expansion method based on latent semantic analysis [J]. Computer Engineering, 2009, 35(10): 49-53; Ning Jian, Lin Hongfei. Cross-language retrieval based on improved latent semantic analysis [J]. Journal of Chinese Information Processing, 2010, 24(3): 105-111);
language models and topic models (Ganguly Debasis, Leveling Johannes, Jones Gareth J.F. Cross-lingual topical relevance models [C]. In: 24th International Conference on Computational Linguistics (COLING 2012), 2012; Wang Xuwen, Zhang Qiang, Wang Xiaojie, et al. LDA based pseudo relevance feedback for cross language information retrieval [C]. IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS 2012). Hangzhou: IEEE, 2012: 1993-1998; Xuwen Wang, Qiang Zhang, Xiaojie Wang, et al. Cross-lingual Pseudo Relevance Feedback Based on Weak Relevant Topic Alignment. Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation, PACLIC 29, Shanghai, China, 2015: 529-534).
The languages studied are predominantly English: most of this work addresses cross-language retrieval between English and other languages.
Since Nanning, China, became the permanent host city of the China-ASEAN Expo, political, economic and cultural ties between China and the ASEAN countries have become more frequent and closer. Research on cross-language information retrieval and cross-language information services for ASEAN languages is therefore increasingly urgent and important.
Summary of the invention
The object of the invention is to address the above problems in the prior art by jointly applying all-weighted association rule mining and user relevance feedback to Indonesian-Chinese cross-language information retrieval. It provides an Indonesian-Chinese cross-language retrieval method and system fusing association patterns and user feedback that improves Indonesian-Chinese cross-language retrieval performance, with a larger gain for long queries.
To achieve the above object, the invention adopts the following technical solution:
An Indonesian-Chinese cross-language retrieval method fusing association patterns and user feedback comprises the following steps:
(1) The Indonesian user query is translated into a Chinese query by the machine translation module and submitted to a search engine for an initial search on the Internet, yielding the initial retrieval result document set;
(2) The top r Chinese documents of the cross-language initial retrieval result document set are extracted and presented to the user;
(3) The user judges the relevance of the Chinese documents in the cross-language initial retrieval result document set, yielding the user-feedback relevant document set; the total number of documents in this set is denoted n;
(4) The user-feedback relevant document set is preprocessed, that is, Chinese word segmentation, stop-word removal, feature-word weight calculation and feature-word extraction are performed, to build the initial relevant document database;
(5) The initial relevant document database is scanned to mine the all-weighted feature-word candidate 1-itemsets $C_1$. The weight $w(C_1)$ is calculated, and the maximum weight $maxCw_i(!C_1)$ of the items other than $C_1$ and the support count $n_{c1}$ of $C_1$ are tallied. With ms the minimum support threshold, the value of KIWT(1,2) is calculated as: $KIWT(1,2) = n \times 1 \times ms - n_{c1} \times maxCw_i(!C_1)$;
(6) The support $FTISup(C_1)$ of $C_1$ is calculated. If $FTISup(C_1) \ge ms$, the frequent 1-itemset $L_1$ is mined from the candidate 1-itemsets $C_1$ and added to the set L of all-weighted feature-word frequent itemsets. $FTISup(C_1)$ is calculated as: $FTISup(C_1) = \frac{w(C_1)}{n \times 1}$;
(7) The k-itemsets are mined for $k \ge 2$, through steps (7.1) to (7.7):
(7.1) The weight of each candidate (k-1)-itemset $C_{k-1}$ is compared with its KIWT(k-1, k) value, and every candidate $C_{k-1}$ with $w(C_{k-1}) < KIWT(k-1, k)$ is pruned;
(7.2) The remaining candidate (k-1)-itemsets $C_{k-1}$ are combined by an Apriori join to obtain $C_k$;
(7.3) When k = 2, candidate 2-itemsets that contain no query term are pruned;
(7.4) The initial relevant document database is scanned. The maximum weight $maxCw_i(!C_k)$ of the items other than $C_k$ and the support count $n_{ck}$ of $C_k$ are tallied, and the weight $w(C_k)$ and the value of KIWT(k-1, k) are calculated, where: $KIWT(k-1, k) = n \times k \times ms - n_{ck} \times maxCw_i(!C_k)$;
(7.5) Candidates $C_k$ whose $n_{ck}$ is 0 are pruned;
(7.6) For the remaining candidate k-itemsets $C_k$, the support $FTISup(C_k)$ is calculated. If $FTISup(C_k) \ge ms$, the frequent k-itemset $L_k$ is mined from the candidate k-itemsets $C_k$ and added to the set L of all-weighted feature-word frequent itemsets. $FTISup(C_k)$ is calculated as: $FTISup(C_k) = \frac{w(C_k)}{n \times k}$;
(7.7) If k exceeds the candidate itemset length threshold, or the set of candidate k-itemsets is empty, mining ends; otherwise steps (7.1) to (7.6) are repeated;
(8) All-weighted feature-word association rules containing query terms are mined from the set L of all-weighted feature-word frequent itemsets to build the all-weighted association rule base;
(9) Cross-language expansion words related to the original query are extracted from the all-weighted association rule base to build the expansion word library;
(10) The original query is combined with the expansion words and submitted to the search engine for a second search, yielding the final Chinese result documents;
(11) The final Chinese result documents are submitted to the machine translation module and translated into Indonesian documents; finally, both the final Chinese result documents and the final Indonesian result documents are returned to the user.
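Steps (1) to (11) can be read as a two-pass retrieval pipeline. The Python sketch below only illustrates that control flow. The function name and the callables `translate`, `search`, `judge_relevant` and `mine_expansion_terms` are hypothetical placeholders introduced for illustration; they are not interfaces defined by the invention, and joining query and expansion words with a space is an assumption.

```python
from typing import Callable, List, Tuple


def retrieve_id_zh(
    id_query: str,
    translate: Callable[[str, str, str], str],   # hypothetical: (text, source, target) -> text
    search: Callable[[str], List[str]],          # hypothetical: query -> ranked documents
    judge_relevant: Callable[[str], bool],       # hypothetical stand-in for the user's judgment
    mine_expansion_terms: Callable[[List[str], str], List[str]],  # stand-in for steps (4)-(9)
    r: int,
) -> Tuple[List[str], List[str]]:
    """Two-pass Indonesian-to-Chinese retrieval following steps (1)-(11)."""
    zh_query = translate(id_query, "id", "zh")                # step (1): translate the query
    top_r = search(zh_query)[:r]                              # steps (1)-(2): first pass, top r
    relevant = [d for d in top_r if judge_relevant(d)]        # step (3): user feedback
    expansion = mine_expansion_terms(relevant, zh_query)      # steps (4)-(9): mine expansion words
    new_query = " ".join([zh_query, *expansion])              # step (10): combine (assumed format)
    final_zh = search(new_query)                              # step (10): second pass
    final_id = [translate(d, "zh", "id") for d in final_zh]   # step (11): translate results
    return final_zh, final_id
```

Passing the components in as callables keeps the sketch independent of any particular translation service or search engine.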
The feature-word weights in step (4) are calculated with the tf-idf method from three quantities: $tf_{m,n}$, the number of occurrences of feature word $t_m$ in document $d_n$; $df_m$, the number of documents containing feature word $t_m$; and $N$, the total number of documents in the document collection.
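As an illustration of tf-idf weighting over these three quantities, the sketch below assumes the common variant $tf_{m,n} \times \ln(N / df_m)$. That exact form, the natural logarithm, and the function name `tfidf_weights` are assumptions; the formula actually used by the invention is not reproduced in this text. Input documents are assumed to be already segmented and stripped of stop words.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_weights(tokenized_docs: List[List[str]]) -> List[Dict[str, float]]:
    """Return one {feature word: weight} map per document (assumed tf * ln(N/df) variant)."""
    n_docs = len(tokenized_docs)
    # df[t]: number of documents that contain feature word t
    df = Counter()
    for tokens in tokenized_docs:
        df.update(set(tokens))
    weighted = []
    for tokens in tokenized_docs:
        tf = Counter(tokens)  # tf[t]: occurrences of t in this document
        weighted.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weighted
```

Under this variant a word that occurs in every document receives weight 0, so a real system might smooth the logarithm or filter such words during feature-word extraction.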
Step (8) above comprises steps (8.1) to (8.4):
(8.1) An all-weighted frequent i-itemset $tlL_i$ is taken from the set L of all-weighted feature-word frequent itemsets, and all proper subsets of $tlL_i$ are found;
(8.2) Any two proper subsets $tlI_1$ and $tlI_2$ are taken from the set of proper subsets of $tlL_i$. When $tlI_1 \cap tlI_2 = \emptyset$ and $tlI_1 \cup tlI_2 = L_i$: if $FTARConf(tlI_1 \to tlI_2) \ge mc$, the all-weighted feature-word strong association rule $tlI_1 \to tlI_2$ is mined; if $FTARConf(tlI_2 \to tlI_1) \ge mc$, the all-weighted feature-word strong association rule $tlI_2 \to tlI_1$ is mined. Here mc is the minimum confidence threshold, $tlI_1$ and $tlI_2$ are all-weighted feature-word frequent itemsets that are proper subsets of $tlL_i$, and $FTARConf(tlI_1 \to tlI_2)$ is the confidence of the all-weighted feature-word association rule $tlI_1 \to tlI_2$, calculated as:
$$FTARConf(tlI_1 \to tlI_2) = \frac{FTISup(L_i)}{FTISup(tlI_1)}$$
where $FTISup(L_i)$ is the support of the all-weighted frequent itemset $L_i$ and $FTISup(tlI_1)$ is the support of the all-weighted frequent itemset $tlI_1$;
(8.3) Step (8.2) is repeated until every proper subset in the proper-subset set of the all-weighted frequent i-itemset $tlL_i$ has been taken out once and only once, after which the method proceeds to step (8.4);
(8.4) Steps (8.1) to (8.3) are repeated until every itemset in the set L of all-weighted feature-word frequent itemsets has been taken out once and only once, at which point mining ends.
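Steps (8.1) to (8.4) can be sketched as follows. This is a minimal illustration that assumes the frequent itemsets and their supports are already available as a dictionary, and that the confidence of a rule is the support of the whole itemset divided by the support of the antecedent. The name `mine_strong_rules` and the data layout are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], FrozenSet[str], float]  # (antecedent, consequent, confidence)


def mine_strong_rules(frequent: Dict[FrozenSet[str], float], mc: float) -> List[Rule]:
    """Generate rules A -> B with A and B disjoint, A | B frequent, and confidence >= mc."""
    rules: List[Rule] = []
    for itemset, support in frequent.items():          # step (8.1): take each frequent itemset
        if len(itemset) < 2:
            continue                                   # a rule needs two non-empty sides
        items = sorted(itemset)
        for size in range(1, len(items)):              # step (8.2): every proper subset once
            for subset in combinations(items, size):
                antecedent = frozenset(subset)
                consequent = itemset - antecedent      # disjoint, and the union is the itemset
                if antecedent not in frequent:
                    continue                           # antecedent support unavailable
                confidence = support / frequent[antecedent]
                if confidence >= mc:
                    rules.append((antecedent, consequent, confidence))
    return rules
```

Because each proper subset serves as the antecedent exactly once, both directions of every partition ($tlI_1 \to tlI_2$ and $tlI_2 \to tlI_1$) are checked.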
A retrieval system applying the above Indonesian-Chinese cross-language retrieval method fusing association patterns and user feedback comprises the following 4 modules and 3 databases:
Machine translation module: this module uses the Bing machine translation interface to translate the Indonesian user query into a Chinese query, and to translate the final Chinese result documents into Indonesian documents for the user;
Search engine module: this module is a search engine that searches the Internet with the translated Chinese query to obtain the cross-language initial retrieval result document set;
All-weighted association pattern mining and user relevance feedback module: presents the top r documents of the cross-language initial retrieval result document set to the user, who judges their relevance and thereby determines the initial relevant document database. It then applies all-weighted association rule mining to that database to mine expansion words related to the query and performs cross-language query expansion; the expansion words combined with the original query are searched again to obtain the final Chinese result documents;
Final result display module: submits the final Chinese result documents to the machine translation module for translation into Indonesian documents, and returns both the final Chinese result documents and the final Indonesian result documents to the user;
Initial relevant document database;
All-weighted association rule base;
Expansion word library.
The above all-weighted association pattern mining and user relevance feedback module comprises the following 5 modules:
User click-behavior relevance feedback extraction module: captures the document download behavior produced while the user browses the initial retrieval result document set, and builds the user-feedback relevant document set from the initial-result documents the user downloaded;
Document preprocessing module: preprocesses the user-feedback relevant document set by Chinese word segmentation, stop-word removal, feature-word weight calculation and feature-word extraction, and builds the initial relevant document database;
All-weighted association rule mining module: performs all-weighted association rule mining on the initial relevant document database, mines the all-weighted feature-word frequent itemsets and association rule patterns that contain original query terms, and builds the all-weighted association rule base;
Cross-language query expansion word generation module: extracts expansion words related to the original query from the all-weighted association rule base and builds the expansion word library;
Cross-language query expansion module: extracts Chinese expansion words from the expansion word library, combines them with the original query into a new query, and resubmits it to the search engine for a search on the Internet to obtain the final Chinese result documents.
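The last two modules can be illustrated with the sketch below. The text does not state how expansion words are selected from the rule base, so the selection policy here is an assumption: the consequent terms of rules whose antecedent contains a query term are taken and ranked by rule confidence. The names `expansion_terms`, `expand_query` and the `top_k` cut-off are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[FrozenSet[str], FrozenSet[str], float]  # (antecedent, consequent, confidence)


def expansion_terms(rules: Iterable[Rule], query_terms: Set[str], top_k: int) -> List[str]:
    """Pick up to top_k non-query terms implied by rules whose antecedent has a query term."""
    best: Dict[str, float] = {}
    for antecedent, consequent, confidence in rules:
        if antecedent & query_terms:                   # rule is triggered by the query
            for term in consequent - query_terms:      # keep only genuinely new terms
                best[term] = max(best.get(term, 0.0), confidence)
    # Highest confidence first; ties broken alphabetically for a stable order
    ranked = sorted(best, key=lambda t: (-best[t], t))
    return ranked[:top_k]


def expand_query(original_query: str, terms: List[str]) -> str:
    """Combine the original query with the expansion words (space-joined by assumption)."""
    return " ".join([original_query, *terms])
```

A rule base would typically be filtered this way once per query, with the resulting terms stored as that query's entry in the expansion word library.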
Compared with the prior art, the advantages of the invention are:
(1) The invention jointly applies all-weighted association rule mining and user relevance feedback to Indonesian-Chinese cross-language information retrieval, proposing an Indonesian-Chinese cross-language information retrieval method and system in which user click and download behavior is fused with all-weighted association pattern mining. Compared with the monolingual Chinese text retrieval baseline MB, the Indonesian-Chinese cross-language retrieval baseline CLB, and the traditional pseudo-relevance-feedback cross-language retrieval method CLR_PRF, the retrieval performance of the method is greatly improved. The experimental results show that the invention obtains good retrieval results: all of its evaluation measures are higher than those of the baseline CLB and the CLR_PRF algorithm. Retrieval with description-type query topics is also better than with title-type topics, and shows the largest gain in MAP.
(2) The experimental results show that the proposed Indonesian-Chinese cross-language information retrieval method and system fusing all-weighted association pattern mining and user relevance feedback is effective and improves cross-language retrieval performance. The main reasons are as follows. In cross-language retrieval, the query translation result strongly affects the retrieval results and often makes the cross-language initial results worse than monolingual initial results; that is, query topic drift occurs. Fusing user click behavior with all-weighted association pattern mining in the Indonesian-Chinese cross-language retrieval model makes it possible to obtain feedback information related to the original query. All-weighted association rule mining then yields expansion words related to the original query, which are used for cross-language query expansion. This avoids the severe topic drift found in cross-language retrieval and improves Indonesian-Chinese cross-language retrieval performance.
Brief description of the drawings
Fig. 1 is a block diagram of the Indonesian-Chinese cross-language retrieval method fusing association patterns and user feedback according to the invention.
Fig. 2 is an overall flow chart of the Indonesian-Chinese cross-language retrieval system fusing association patterns and user feedback according to the invention.
Fig. 3 is a structural block diagram of the Indonesian-Chinese cross-language retrieval system fusing association patterns and user feedback according to the invention.
Fig. 4 is a structural block diagram of the all-weighted association pattern mining and user relevance feedback module according to the invention.
Detailed description of the embodiments
The technical solution of the invention is described in further detail below, in a non-limiting manner, with reference to an embodiment and the accompanying drawings.
First, to better explain the technical solution, the concepts involved in the invention are introduced as follows:
Suppose the target-language (Target Language, TL) initial relevant document set obtained after cross-language initial retrieval and user relevance feedback is $TLdoc = \{tld_1, tld_2, \dots, tld_n\}$, where $tld_i$ ($1 \le i \le n$) is the i-th document in the target-language document set TLdoc and $tld_i = \{t_1, t_2, \dots, t_m, \dots, t_p\}$. Each $t_m$ ($m = 1, 2, \dots, p$) is called a target-language feature-term item (Feature-term Item, FTI), or feature item for short, and usually consists of a character, word or phrase. The corresponding set of feature weights in $tld_i$ is $W_i = \{w_{i1}, w_{i2}, \dots, w_{im}, \dots, w_{ip}\}$, where $w_{im}$ is the weight of the m-th feature item $t_m$ in the i-th document $tld_i$. Let $tlI = \{t_1, t_2, \dots, t_k\}$ denote the set of all feature items in TLdoc; any subset Y of tlI is then called a feature-term itemset (Feature-term Itemset) in TLdoc, or itemset Y for short.
For an itemset $(tlI_1, tlI_2)$ with $tlI_1 \subseteq tlI$, $tlI_2 \subseteq tlI$ and $tlI_1 \cap tlI_2 = \emptyset$, the following basic concepts are given on the basis of all-weighted association pattern mining theory (Huang Mingxuan, Yan Xiaowei, Zhang Shichao. Pseudo-relevance feedback query expansion based on matrix-weighted association rule mining. Journal of Software, Vol.20, No.7, July 2009, pp.1854-1865).
Definition 1. The all-weighted support (Feature-term Itemset Support, FTISup) of a feature-term itemset I ($I = (tlI_1, tlI_2)$) is calculated by formula (1).
$$FTISup(I) = \frac{w_I}{n \times k} \qquad (1)$$
where $w_I$ is the sum of the weights of itemset I over the documents of TLdoc, k is the length of itemset I (its number of items), and n is the total number of documents in the initial relevant document set TLdoc.
Definition 2. The all-weighted confidence (Feature-term Association Rule Confidence, FTARConf) of the inter-term association rule $tlI_1 \to tlI_2$ is given by formula (2).
$$FTARConf(tlI_1 \to tlI_2) = \frac{FTISup(tlI_1, tlI_2)}{FTISup(tlI_1)} \qquad (2)$$
where $FTISup(tlI_1, tlI_2)$ is the all-weighted support of the itemset $(tlI_1, tlI_2)$.
Definition 3. Let the minimum support threshold be ms and the minimum confidence threshold be mc. If $FTISup(tlI_1, tlI_2) \ge ms$ and $FTARConf(tlI_1 \to tlI_2) \ge mc$, then the feature-term itemset $(tlI_1, tlI_2)$ is called a frequent itemset and the inter-term association rule $tlI_1 \to tlI_2$ is called a strong association rule.
Definition 4. The feature-term k-itemset weight threshold (k-Item Weighted Threshold, KIWT) for k-itemsets containing a given q-itemset ($q < k$) is a prediction of the weight of the subsequent itemsets that contain the q-itemset.
Let tlT be an all-weighted q-itemset with $tlT \subseteq tlI$ and $q < k$. Among the items of (tlI - tlT), denote the weights of the (k-q) items with the largest weights as $w_1, w_2, \dots, w_{k-q}$, and let the support count of the q-itemset tlT in TLdoc be SC(tlT). Based on the k-weight threshold theory in the literature (Huang Mingxuan, Yan Xiaowei, Zhang Shichao. Pseudo-relevance feedback query expansion based on matrix-weighted association rule mining. Journal of Software, Vol.20, No.7, July 2009, pp.1854-1865), the weight threshold of feature-term k-itemsets containing the q-itemset is calculated as shown in formula (3).
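Definitions 1 to 3 can be illustrated with a small numeric sketch. It assumes that the all-weighted support of an itemset is its summed weight over the documents containing it, divided by $n \times k$, and that the rule confidence is the ratio of the two supports, as the explanations attached to Definitions 1 and 2 indicate. The function names, the toy documents and their weights are invented for illustration only.

```python
from typing import Dict, FrozenSet, List

Doc = Dict[str, float]  # feature word -> weight in one document


def ftisup(docs: List[Doc], itemset: FrozenSet[str]) -> float:
    """Assumed all-weighted support: summed item weights in supporting docs / (n * k)."""
    n, k = len(docs), len(itemset)
    total = sum(
        sum(doc[t] for t in itemset)
        for doc in docs
        if all(t in doc for t in itemset)   # only documents containing the whole itemset
    )
    return total / (n * k)


def ftarconf(docs: List[Doc], antecedent: FrozenSet[str], consequent: FrozenSet[str]) -> float:
    """Assumed confidence of antecedent -> consequent: FTISup(union) / FTISup(antecedent)."""
    return ftisup(docs, antecedent | consequent) / ftisup(docs, antecedent)


# Toy weighted documents (illustrative values only)
docs = [
    {"bali": 0.8, "tourism": 0.6, "visa": 0.4},
    {"bali": 0.7, "tourism": 0.5},
    {"visa": 0.9, "passport": 0.3},
]
support = ftisup(docs, frozenset({"bali", "tourism"}))
confidence = ftarconf(docs, frozenset({"bali"}), frozenset({"tourism"}))
is_strong = support >= 0.2 and confidence >= 0.6   # Definition 3 with example ms and mc
```

Here the itemset {bali, tourism} has summed weight 2.6 over n = 3 documents and k = 2 items, so the rule bali -> tourism qualifies as strong under the example thresholds ms = 0.2 and mc = 0.6.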
Second, as shown in Fig. 1, the Indonesian-Chinese cross-language retrieval method fusing association patterns and user feedback of this embodiment comprises the following steps:
(1) The Indonesian user query is translated into a Chinese query by the machine translation module and submitted to a search engine for an initial search on the Internet, yielding the initial retrieval result document set. The machine translation module uses the Bing machine translation interface, i.e., the Microsoft Translator API; the search engine module may be an existing search engine such as Baidu or Google;
(2) The top r Chinese documents of the cross-language initial retrieval result document set are extracted and presented to the user;
(3) The user judges the relevance of the Chinese documents in the cross-language initial retrieval result document set, yielding the user-feedback relevant document set; the total number of documents in this set is denoted n;
(4) The user-feedback relevant document set is preprocessed, that is, Chinese word segmentation, stop-word removal, feature-word weight calculation and feature-word extraction are performed, to build the initial relevant document database;
The feature-word weights are calculated with the tf-idf method from three quantities: $tf_{m,n}$, the number of occurrences of feature word $t_m$ in document $d_n$; $df_m$, the number of documents containing feature word $t_m$; and $N$, the total number of documents in the document collection;
(5) The initial relevant document database is scanned to mine the all-weighted feature-word candidate 1-itemsets $C_1$. The weight $w(C_1)$ is calculated, and the maximum weight $maxCw_i(!C_1)$ of the items other than $C_1$ and the support count $n_{c1}$ of $C_1$ are tallied. With ms the minimum support threshold, the value of KIWT(1,2) is calculated as: $KIWT(1,2) = n \times 1 \times ms - n_{c1} \times maxCw_i(!C_1)$;
(6) The support $FTISup(C_1)$ of $C_1$ is calculated. If $FTISup(C_1) \ge ms$, the frequent 1-itemset $L_1$ is mined from the candidate 1-itemsets $C_1$ and added to the set L of all-weighted feature-word frequent itemsets. $FTISup(C_1)$ is calculated as: $FTISup(C_1) = \frac{w(C_1)}{n \times 1}$;
(7) The k-itemsets are mined for $k \ge 2$, through steps (7.1) to (7.7):
(7.1) The weight of each candidate (k-1)-itemset $C_{k-1}$ is compared with its KIWT(k-1, k) value, and every candidate $C_{k-1}$ with $w(C_{k-1}) < KIWT(k-1, k)$ is pruned;
(7.2) The remaining candidate (k-1)-itemsets $C_{k-1}$ are combined by an Apriori join to obtain $C_k$;
(7.3) When k = 2, candidate 2-itemsets that contain no query term are pruned;
(7.4) The initial relevant document database is scanned. The maximum weight $maxCw_i(!C_k)$ of the items other than $C_k$ and the support count $n_{ck}$ of $C_k$ are tallied, and the weight $w(C_k)$ and the value of KIWT(k-1, k) are calculated, where: $KIWT(k-1, k) = n \times k \times ms - n_{ck} \times maxCw_i(!C_k)$;
(7.5) Candidates $C_k$ whose $n_{ck}$ is 0 are pruned;
(7.6) For the remaining candidate k-itemsets $C_k$, the support $FTISup(C_k)$ is calculated. If $FTISup(C_k) \ge ms$, the frequent k-itemset $L_k$ is mined from the candidate k-itemsets $C_k$ and added to the set L of all-weighted feature-word frequent itemsets. $FTISup(C_k)$ is calculated as: $FTISup(C_k) = \frac{w(C_k)}{n \times k}$;
(7.7) If k exceeds the candidate itemset length threshold, or the set of candidate k-itemsets is empty, mining ends; otherwise steps (7.1) to (7.6) are repeated;
(8) All-weighted feature-word association rules containing query terms are mined from the set L of all-weighted feature-word frequent itemsets to build the all-weighted association rule base, through steps (8.1) to (8.4):
(8.1) An all-weighted frequent i-itemset $tlL_i$ is taken from the set L of all-weighted feature-word frequent itemsets, and all proper subsets of $tlL_i$ are found;
(8.2) Any two proper subsets $tlI_1$ and $tlI_2$ are taken from the set of proper subsets of $tlL_i$. When $tlI_1 \cap tlI_2 = \emptyset$ and $tlI_1 \cup tlI_2 = L_i$: if $FTARConf(tlI_1 \to tlI_2) \ge mc$, the all-weighted feature-word strong association rule $tlI_1 \to tlI_2$ is mined; if $FTARConf(tlI_2 \to tlI_1) \ge mc$, the all-weighted feature-word strong association rule $tlI_2 \to tlI_1$ is mined. Here mc is the minimum confidence threshold, $tlI_1$ and $tlI_2$ are all-weighted feature-word frequent itemsets that are proper subsets of $tlL_i$, and $FTARConf(tlI_1 \to tlI_2)$ is the confidence of the all-weighted feature-word association rule $tlI_1 \to tlI_2$, calculated as:
$$FTARConf(tlI_1 \to tlI_2) = \frac{FTISup(L_i)}{FTISup(tlI_1)}$$
where $FTISup(L_i)$ is the support of the all-weighted frequent itemset $L_i$ and $FTISup(tlI_1)$ is the support of the all-weighted frequent itemset $tlI_1$;
(8.3) Step (8.2) is repeated until every proper subset in the proper-subset set of the all-weighted frequent i-itemset $tlL_i$ has been taken out once and only once, after which the method proceeds to step (8.4);
(8.4) Steps (8.1) to (8.3) are repeated until every itemset in the set L of all-weighted feature-word frequent itemsets has been taken out once and only once, at which point mining ends;
(9) Cross-language expansion words related to the original query are extracted from the all-weighted association rule base to build the expansion word library;
(10) The original query is combined with the expansion words and submitted to the search engine for a second search, yielding the final Chinese result documents;
(11) The final Chinese result documents are submitted to the machine translation module and translated into Indonesian documents; finally, both the final Chinese result documents and the final Indonesian result documents are returned to the user.
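The level-wise mining of steps (5) to (7.7) can be sketched as below. This is an illustrative reading, not the invention's reference implementation. It assumes that the support is the summed weight divided by $n \times k$, applies the KIWT expression exactly as written in steps (5) and (7.4), and assumes that the maximum weight of the items outside a candidate is taken over the whole database. The function names and data layout are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

Doc = Dict[str, float]  # feature word -> weight in one document


def itemset_stats(docs: List[Doc], itemset: FrozenSet[str]) -> Tuple[float, int]:
    """Return (summed weight, support count) over the documents containing the itemset."""
    weight, count = 0.0, 0
    for doc in docs:
        if all(term in doc for term in itemset):
            weight += sum(doc[term] for term in itemset)
            count += 1
    return weight, count


def max_other_weight(docs: List[Doc], itemset: FrozenSet[str]) -> float:
    """Largest weight of any item outside the itemset (assumed: over the whole database)."""
    return max((w for doc in docs for t, w in doc.items() if t not in itemset), default=0.0)


def mine_frequent_itemsets(
    docs: List[Doc], query_terms: Set[str], ms: float, max_len: int
) -> Dict[FrozenSet[str], float]:
    """Return {frequent itemset: support} following steps (5)-(7.7)."""
    n = len(docs)
    candidates = [frozenset([t]) for t in sorted({t for d in docs for t in d})]
    frequent: Dict[FrozenSet[str], float] = {}
    k = 1
    while candidates and k <= max_len:                 # step (7.7): stop conditions
        survivors = []
        for cand in candidates:
            weight, count = itemset_stats(docs, cand)
            if count == 0:
                continue                               # step (7.5): drop unsupported candidates
            support = weight / (n * k)                 # assumed FTISup form
            if support >= ms:
                frequent[cand] = support               # steps (6) and (7.6)
            kiwt = n * k * ms - count * max_other_weight(docs, cand)  # as in (5) and (7.4)
            if weight >= kiwt:
                survivors.append(cand)                 # step (7.1): keep for the next join
        k += 1
        # Step (7.2): Apriori-style join of survivors that differ in exactly one item
        joined = {a | b for a, b in combinations(survivors, 2) if len(a | b) == k}
        if k == 2:
            joined = {c for c in joined if c & query_terms}   # step (7.3)
        candidates = sorted(joined, key=sorted)
    return frequent
```

For example, `mine_frequent_itemsets(docs, {"bali"}, ms=0.15, max_len=3)` mines itemsets of up to three feature words from a list of weighted documents. Because 2-itemsets without a query term are dropped in step (7.3), every longer itemset produced by the join also contains a query term.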
Third, as shown in Figs. 2 to 4, the retrieval system applying the Indonesian-Chinese cross-language retrieval method fusing association patterns and user feedback of this embodiment comprises the following 4 modules and 3 databases:
Machine translation module: this module uses the Bing machine translation interface, i.e., the Microsoft Translator API, to translate the Indonesian user query into a Chinese query, and to translate the final Chinese result documents into Indonesian documents for the user;
Search engine module: this module is a search engine that searches the Internet with the translated Chinese query to obtain the cross-language initial retrieval result document set;
All-weighted association pattern mining and user relevance feedback module: presents the top r documents of the cross-language initial retrieval result document set to the user, who judges their relevance and thereby determines the initial relevant document database. It then applies all-weighted association rule mining to that database to mine expansion words related to the query and performs cross-language query expansion; the expansion words combined with the original query are searched again to obtain the final Chinese result documents;
Final result display module: submits the final Chinese result documents to the machine translation module for translation into Indonesian documents, and returns both the final Chinese result documents and the final Indonesian result documents to the user;
Initial relevant document database;
All-weighted association rule base;
Expansion word library.
The all-weighted association pattern mining and user relevance feedback module comprises the following 5 modules:
User click-behavior relevance feedback extraction module: captures the document download behavior produced while the user browses the initial retrieval result document set, and builds the user-feedback relevant document set from the initial-result documents the user downloaded;
Document preprocessing module: preprocesses the user-feedback relevant document set by Chinese word segmentation, stop-word removal, feature-word weight calculation and feature-word extraction, and builds the initial relevant document database;
All-weighted association rule mining module: performs all-weighted association rule mining on the initial relevant document database, mines the all-weighted feature-word frequent itemsets and association rule patterns that contain original query terms, and builds the all-weighted association rule base;
Cross-language query expansion word generation module: extracts expansion words related to the original query from the all-weighted association rule base and builds the expansion word library;
Cross-language query expansion module: extracts Chinese expansion words from the expansion word library, combines them with the original query into a new query, and resubmits it to the search engine for a search on the Internet to obtain the final Chinese result documents.
4. In combination with the technical scheme, the beneficial effects of the present invention are further illustrated by experiment below:
Because search engines cover a wide research scope and involve many factors, the present invention is instead tested in an Indonesian-Chinese cross-language retrieval system based on the vector space model; this experiment is therefore a simulation. Source programs implementing the method and system of the invention were written to carry out the experiment. The Chinese corpus of NTCIR-5 CLIR, the standard cross-language information retrieval test collection of the NTCIR multilingual processing evaluation workshop sponsored by Japan's National Institute of Informatics, is used as the experimental corpus.
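The simulation is said to run on a vector space model, without giving its details. As a minimal sketch only, assuming queries and documents are sparse dictionaries mapping a term to its weight and that ranking uses cosine similarity (both are assumptions, not statements from the patent), the scoring could look like this; `cosine_similarity` is a hypothetical helper name.

```python
import math
from typing import Dict


def cosine_similarity(query_vec: Dict[str, float], doc_vec: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse term-weight vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * doc_vec.get(term, 0.0) for term, weight in query_vec.items())
    query_norm = math.sqrt(sum(w * w for w in query_vec.values()))
    doc_norm = math.sqrt(sum(w * w for w in doc_vec.values()))
    if query_norm == 0.0 or doc_norm == 0.0:
        return 0.0  # an empty vector matches nothing
    return dot / (query_norm * doc_norm)
```

Documents would then be ranked by descending score for each translated query.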
NTCIR-5 CLIR consists of a query set, a document set and a result set. The query set has 50 query topics, each given in 4 types: TITLE, DESC, NARR and CONC. This experiment uses the TITLE and DESC types. A TITLE query briefly describes the topic with nouns and noun phrases and is a short query; a DESC query briefly describes the topic in sentence form and is a long query. The result set has 2 evaluation standards, Rigid and Relax: under Rigid, an answer counts if it is highly relevant or relevant to the original query; under Relax, an answer counts if it is highly relevant, relevant or partially relevant.
To carry out the experiment on the Indonesian-Chinese cross-language information retrieval model described here, professional translators were invited to manually translate the 50 Chinese-version NTCIR-5 CLIR query topics into Indonesian queries.
In this experiment, the Chinese lexical analysis system ICTCLAS, developed by the Institute of Computing Technology, Chinese Academy of Sciences, is used to preprocess the Chinese experimental corpus and the translated Chinese queries. Feature-word weights are computed with the traditional tf-idf method. The weight $w_{i,q}$ of a translated query term is computed as shown in formula (4) (from G. Salton, C. Buckley. Term-weighting approaches in automatic text retrieval [J]. Information Processing & Management, 1988, 24(5): 513-523).
Here $tf_{i,q}$ is the raw frequency of the query term in the query text, $N$ is the total number of initial-retrieval relevant documents, and $df_i$ is the number of initial-retrieval relevant documents containing the i-th query term.
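Formula (4) itself is not reproduced in this text, so the following is only an illustrative sketch of one common tf-idf variant built from the three quantities named above: raw term frequency multiplied by log(N / df). This exact form is an assumption and may differ from the patent's formula (4); `tfidf_weight` is a hypothetical name.

```python
import math


def tfidf_weight(tf: int, df: int, n_docs: int) -> float:
    """Assumed tf-idf variant: tf * log(N / df). Not necessarily formula (4)."""
    if tf <= 0 or df <= 0 or n_docs <= 0:
        return 0.0  # a term that never occurs carries no weight
    return tf * math.log(n_docs / df)
```

With this variant a term that occurs in every document gets weight 0, since log(N / N) = 0.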
In this experiment, the weight of a Chinese expansion term is set as follows: the confidence of the matrix-weighted association rule is used as the weight of the expansion term; when several association rules contain the same term, the highest confidence among them is taken as that expansion term's weight.
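The weighting rule above can be sketched as follows. The rule representation (antecedent, consequent, confidence) and the choice to treat every non-query term of a rule as an expansion term are assumptions made for illustration; `expansion_weights` is a hypothetical name.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Assumed rule shape: (antecedent terms, consequent terms, confidence).
Rule = Tuple[FrozenSet[str], FrozenSet[str], float]


def expansion_weights(rules: List[Rule], query_terms: Iterable[str]) -> Dict[str, float]:
    """Give each expansion term the highest confidence of any rule containing it."""
    query = set(query_terms)
    weights: Dict[str, float] = {}
    for antecedent, consequent, confidence in rules:
        # Assumption: every term of the rule that is not a query term is an expansion term.
        for term in (antecedent | consequent) - query:
            if confidence > weights.get(term, 0.0):
                weights[term] = confidence
    return weights
```

For example, if the term "a" appears in one rule with confidence 0.6 and in another with confidence 0.8, its weight is 0.8.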
Experimental evaluation baselines:
(1) Monolingual baseline (MB): the retrieval results obtained by retrieving Chinese documents directly with Chinese queries.
(2) Cross-language baseline (CLB): the first-pass cross-language retrieval results without any relevance feedback, i.e. the results obtained by retrieving Chinese documents with the Indonesian queries after translation by the machine translation system.
(3) CLR_PRF, the traditional cross-language retrieval method based on pseudo-relevance feedback (Jianfeng Gao, Jianyun Nie, Jian Zhang, et al. TREC-9 CLIR Experiments at MSRCN [C]. In: Proc. of the 9th Text Retrieval Evaluation Conference, 2001: 343-353; Wu Dan, He Daqing, Wang Huilin. Cross-language query expansion based on pseudo-relevance [J]. Journal of the China Society for Scientific and Technical Information, 2010, 29(2): 232-239). In this experiment, the top 20 cross-language initial-retrieval documents are extracted to build the initial-retrieval relevant document set, and the top 20 feature words by weight (in descending order) are taken as expansion terms.
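The CLR_PRF parameter setting (top 20 documents, top 20 terms) can be sketched as below. How term weights are aggregated across the feedback documents is not stated here, so summing per-document weights is an assumption, as are the function name and the exclusion of original query terms.

```python
from collections import Counter
from typing import Dict, Iterable, List


def prf_expansion_terms(
    ranked_docs: List[Dict[str, float]],
    query_terms: Iterable[str],
    n_docs: int = 20,
    n_terms: int = 20,
) -> List[str]:
    """Pick the highest-weighted terms from the top-ranked documents."""
    totals: Counter = Counter()
    for doc in ranked_docs[:n_docs]:
        for term, weight in doc.items():
            totals[term] += weight  # assumption: weights are summed over documents
    query = set(query_terms)
    candidates = [(t, w) for t, w in totals.items() if t not in query]
    # Descending weight; ties broken alphabetically for a stable result.
    candidates.sort(key=lambda tw: (-tw[1], tw[0]))
    return [term for term, _ in candidates[:n_terms]]
```

Pseudo-relevance feedback treats the top-ranked documents as relevant without asking the user, which is the key difference from the user-feedback method described in this document.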
Experimental parameters of the method of the invention: the top 100 cross-language initial-retrieval documents are extracted and submitted to the user, and the initial-retrieval document set is determined after the user's relevance judgment. In this experiment, the documents among the top 100 initially retrieved that appear in the known result set are treated as the user's relevance feedback and extracted to build the user's initial-retrieval relevant document set; finally, fully weighted association rule mining is applied to this set to mine expansion terms and realize query expansion.
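The simulated user feedback described above amounts to intersecting the top-ranked documents with the known relevant set. A minimal sketch, assuming documents are identified by string IDs (the function name and ID format are hypothetical):

```python
from typing import Iterable, List


def simulated_feedback(
    ranked_doc_ids: List[str], relevant_ids: Iterable[str], top_n: int = 100
) -> List[str]:
    """Keep the top-ranked documents that the known result set marks as relevant."""
    relevant = set(relevant_ids)
    # Rank order is preserved so the feedback set stays reproducible.
    return [doc_id for doc_id in ranked_doc_ids[:top_n] if doc_id in relevant]
```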
Source programs were written to run the method of the invention and the baseline methods MB, CLB and CLR_PRF on the NTCIR-5 CLIR test collection for Indonesian-Chinese cross-language text retrieval, and their cross-language retrieval performance was compared and analyzed.
(1) Baseline test results
The experimental source program was run with the title and description parts of the 50 NTCIR-5 CLIR query topics submitted for Chinese monolingual retrieval, Indonesian-Chinese cross-language retrieval and traditional pseudo-relevance-feedback-based Indonesian-Chinese cross-language retrieval, i.e. the baseline algorithms MB, CLB and CLR_PRF were run; the retrieval results of the 3 baseline methods are shown in Table 1.
Table 1:
The results in Table 1 show that each evaluation index of the Indonesian-Chinese cross-language baseline CLB and of the traditional CLR_PRF method reaches only about 30% to 60% of the monolingual baseline MB, and that retrieval with the long description-type queries performs better than with the short title-type queries. For the CLR_PRF algorithm, all evaluation indices except MAP improve over the baseline CLB, by about 5% to 30%, whereas the MAP values generally fall, by up to 46%. These results show that cross-language retrieval is affected by query translation and its performance is generally low, still short of the corresponding monolingual retrieval performance.
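The evaluation indices used in these tables (MAP, P@5, P@15) are standard retrieval metrics. The sketch below shows their usual definitions; it is not the evaluation program used in the experiment, and the function names are hypothetical.

```python
from typing import List, Set, Tuple


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k results that are relevant (P@k)."""
    if k <= 0:
        return 0.0
    return sum(1 for doc_id in ranked[:k] if doc_id in relevant) / k


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of the precision values at the rank of each relevant document found."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(runs: List[Tuple[List[str], Set[str]]]) -> float:
    """MAP: average precision averaged over all query topics."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)
```

Under the Rigid standard the relevant set would hold only highly relevant and relevant documents; under Relax it would also include partially relevant ones.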
(2) Retrieval performance of the method of the invention compared with the baseline algorithms
Using the title and description types of the 50 NTCIR-5 CLIR query topics, retrieval performance was tested in two settings, varying the support threshold and varying the confidence threshold, and compared with the Indonesian-Chinese cross-language baseline CLB, the traditional CLR_PRF method and the monolingual baseline MB. Experimental details: retrieval performance as the support threshold varies is compared in Table 2, and the MAP, P@5 and P@15 values of the retrieval results as the confidence threshold varies are shown in Table 3.
Table 2:
Table 3:
The results in Table 2 show that, as the fully weighted support threshold varies, every index of the retrieval results of the method of the invention is higher than that of the Indonesian-Chinese cross-language baseline CLB and of the traditional pseudo-relevance cross-language retrieval method CLR_PRF, and reaches 60% to 102% of the monolingual baseline MB. Compared with the baseline CLB, the largest improvement is 91.55% (the P@5 value under Rigid evaluation) and the smallest is 36.06% (the P@15 value under Relax evaluation). Compared with the CLR_PRF method, the largest improvement is 244.97% (the MAP value of the description query type under Rigid evaluation) and the smallest is 32.89%. In particular, the MAP value of the description query type under Rigid evaluation reaches and exceeds the monolingual baseline MB by 2%. In addition, retrieval with the description-type query topics performs better than with the title type, and the MAP values of its retrieval results improve the most.
The results in Table 3 show that, as the confidence threshold varies, the present invention obtains good retrieval results: every index is higher than that of the baseline CLB and of the CLR_PRF algorithm, and reaches 58.07% to 101.2% of the monolingual baseline MB. Retrieval with the description-type query topics again performs better than with the title type, and the MAP values of its retrieval results improve the most.
In summary, the present invention has good application and promotion value.
Claims (5)
1. An Indonesian-Chinese cross-language retrieval method fusing association patterns and user feedback, characterized by comprising the following steps:
(1) the Indonesian user query is translated into a Chinese query by the machine translation module and submitted to a search engine for preliminary retrieval on the Internet, obtaining the initial-retrieval result document set;
(2) the top r Chinese documents of the cross-language initial-retrieval result document set are extracted and submitted to the user;
(3) the user judges the relevance of the Chinese documents of the cross-language initial-retrieval result document set, yielding the user-feedback relevant document set; the total number of documents in this set is denoted n;
(4) the user-feedback relevant document set is preprocessed, i.e. Chinese word segmentation, stop-word removal, feature-word weight calculation and feature-word extraction are performed, and the initial-retrieval relevant document database is built;
(5) the initial-retrieval relevant document database is scanned and the fully weighted feature-word candidate 1-itemsets $C_1$ are mined; the weight $w(C_1)$ of $C_1$ is computed, and the maximum weight $maxCw_i(!C_1)$ of the items other than $C_1$ and the support count $n_{c1}$ of $C_1$ are counted; with $ms$ the minimum support threshold, the value of $KIWT(1,2)$ is computed as: $KIWT(1,2) = n \times 1 \times ms - n_{c1} \times maxCw_i(!C_1)$;
(6) the support $FTISup(C_1)$ of $C_1$ is computed; if $FTISup(C_1) \geq ms$, the frequent 1-itemset $L_1$ is mined from the candidate 1-itemset $C_1$ and added to the fully weighted feature-word frequent itemset collection $L$; $FTISup(C_1)$ is computed as:
(7) the k-itemsets are mined, where $k \geq 2$, comprising steps (7.1) to (7.7):
(7.1) the weight $w(C_{k-1})$ of each candidate (k-1)-itemset $C_{k-1}$ is compared with the value $KIWT(k-1,k)$, and every candidate $C_{k-1}$ with $w(C_{k-1}) < KIWT(k-1,k)$ is pruned;
(7.2) the remaining candidate (k-1)-itemsets $C_{k-1}$ are joined by Apriori join to obtain $C_k$;
(7.3) when $k = 2$, candidate 2-itemsets that contain no query term are pruned;
(7.4) the initial-retrieval relevant document database is scanned; the maximum weight $maxCw_i(!C_k)$ of the items other than $C_k$ and the support count $n_{ck}$ of $C_k$ are counted, and the weight $w(C_k)$ of $C_k$ and the value of $KIWT(k-1,k)$ are computed, where $KIWT(k-1,k) = n \times k \times ms - n_{ck} \times maxCw_i(!C_k)$;
(7.5) candidates $C_k$ whose $n_{ck}$ is 0 are pruned;
(7.6) for the remaining candidate k-itemsets $C_k$, the support $FTISup(C_k)$ is computed; if $FTISup(C_k) \geq ms$, the frequent k-itemset $L_k$ is mined from the candidate k-itemset $C_k$ and added to the fully weighted feature-word frequent itemset collection $L$; $FTISup(C_k)$ is computed as:
(7.7) if k exceeds the candidate length threshold or the candidate k-itemset collection is empty, mining ends; otherwise steps (7.1) to (7.6) are repeated;
(8) fully weighted feature-word association rules containing query terms are mined from the fully weighted feature-word frequent itemset collection $L$, and the fully weighted association rule base is built;
(9) cross-language expansion terms related to the original query are extracted from the fully weighted association rule base, and the expansion term dictionary is built;
(10) the original query combined with the expansion terms is submitted to the search engine for a second retrieval, obtaining the final Chinese result documents;
(11) the final Chinese result documents are submitted to the machine translation module and translated into Indonesian documents; finally, the final Chinese result documents and the final Indonesian result documents are returned to the user.
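Steps (5) to (7) can be illustrated with a simplified sketch. Several points are assumptions rather than the claimed method: the support formula is not reproduced in this text, so support is taken here as w(C_k) / (n × k), which is merely consistent with the n × k × ms term in KIWT; the weight of an itemset is taken as the sum of its items' weights over the documents containing all of them; and the KIWT pruning of step (7.1) is omitted, so every candidate with a non-zero support count is carried into the join. All names are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Tuple

Doc = Dict[str, float]  # feature word -> weight within one document


def itemset_weight(itemset: FrozenSet[str], docs: List[Doc]) -> Tuple[float, int]:
    """Return (summed item weights, number of documents) over documents containing every item."""
    total, count = 0.0, 0
    for doc in docs:
        if all(term in doc for term in itemset):
            total += sum(doc[term] for term in itemset)
            count += 1
    return total, count


def support(itemset: FrozenSet[str], docs: List[Doc]) -> float:
    """Assumed support definition: w(C_k) / (n * k)."""
    weight, _ = itemset_weight(itemset, docs)
    return weight / (len(docs) * len(itemset))


def mine_frequent_itemsets(
    docs: List[Doc], query_terms: Iterable[str], ms: float, max_len: int = 3
) -> Dict[FrozenSet[str], float]:
    """Level-wise mining of itemsets whose assumed support reaches ms."""
    if not docs:
        return {}
    query = set(query_terms)
    candidates = [frozenset([t]) for t in sorted({t for doc in docs for t in doc})]
    frequent: Dict[FrozenSet[str], float] = {}
    k = 1
    while candidates and k <= max_len:
        for cand in candidates:
            sup = support(cand, docs)
            if sup >= ms:
                frequent[cand] = sup
        # Apriori-style join: unions of two k-itemsets that form a (k+1)-itemset.
        joined = {a | b for a, b in combinations(candidates, 2) if len(a | b) == k + 1}
        k += 1
        # Step (7.5): drop candidates that occur in no document.
        candidates = [c for c in joined if itemset_weight(c, docs)[1] > 0]
        if k == 2:
            # Step (7.3): 2-itemsets must contain at least one query term.
            candidates = [c for c in candidates if c & query]
    return frequent
```

Joining candidates rather than frequent itemsets matters because, with weighted items, a superset can be frequent even when one of its subsets is not.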
2. The Indonesian-Chinese cross-language retrieval method fusing association patterns and user feedback according to claim 1, characterized in that the feature-word weights in step (4) are calculated with the tf-idf method, its formula being:
where $tf_{m,n}$ is the number of occurrences of feature word $t_m$ in document $d_n$, $df_m$ is the number of documents containing feature word $t_m$, and $N$ is the total number of documents in the document collection.
3. The Indonesian-Chinese cross-language retrieval method fusing association patterns and user feedback according to claim 1, characterized in that step (8) comprises steps (8.1) to (8.4):
(8.1) a fully weighted frequent i-itemset $tlL_i$ is taken from the fully weighted feature-word frequent itemset collection $L$, and all proper subsets of $tlL_i$ are found;
(8.2) any two proper subsets $tlI_1$ and $tlI_2$ are taken from the proper-subset collection of $tlL_i$; when $tlI_1 \cap tlI_2 = \varnothing$ and $tlI_1 \cup tlI_2 = L_i$, if $FTARConf(tlI_1 \rightarrow tlI_2) \geq mc$, the fully weighted feature-word strong association rule $tlI_1 \rightarrow tlI_2$ is mined; if $FTARConf(tlI_2 \rightarrow tlI_1) \geq mc$, the fully weighted feature-word strong association rule $tlI_2 \rightarrow tlI_1$ is mined; here $mc$ is the minimum confidence threshold, $tlI_1$ and $tlI_2$ are fully weighted feature-word frequent itemsets that are proper subsets of $tlL_i$, and $FTARConf(tlI_1 \rightarrow tlI_2)$ is the confidence of the fully weighted feature-word association rule $tlI_1 \rightarrow tlI_2$, computed as:
where $FTISup(L_i)$ is the support of the fully weighted frequent itemset $L_i$ and $FTISup(tlI_1)$ is the support of the fully weighted frequent itemset $tlI_1$;
(8.3) step (8.2) is repeated until every proper subset in the proper-subset collection of the fully weighted frequent i-itemset $tlL_i$ has been taken out once and only once, then the method proceeds to step (8.4);
(8.4) steps (8.1) to (8.3) are repeated; mining ends when every itemset in the fully weighted feature-word frequent itemset collection $L$ has been taken out once and only once.
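The rule-generation loop of steps (8.1) to (8.4) can be sketched as follows. The confidence formula is not reproduced in this text; since it is described in terms of FTISup(L_i) and FTISup(tlI_1), the sketch assumes the usual ratio FTISup(L_i) / FTISup(tlI_1). That ratio, the requirement that both sides be frequent itemsets, and the function name are assumptions for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], FrozenSet[str], float]  # (antecedent, consequent, confidence)


def generate_rules(frequent: Dict[FrozenSet[str], float], mc: float) -> List[Rule]:
    """Split each frequent itemset into disjoint antecedent/consequent pairs and keep strong rules."""
    rules: List[Rule] = []
    for itemset, itemset_support in frequent.items():
        items = sorted(itemset)
        for size in range(1, len(items)):
            for chosen in combinations(items, size):
                antecedent = frozenset(chosen)
                consequent = itemset - antecedent  # disjoint, and the union is the itemset
                antecedent_support = frequent.get(antecedent)
                # Assumption: both sides must themselves be frequent itemsets.
                if not antecedent_support or consequent not in frequent:
                    continue
                confidence = itemset_support / antecedent_support  # assumed ratio
                if confidence >= mc:
                    rules.append((antecedent, consequent, confidence))
    return rules
```

Enumerating every antecedent of every size covers both directions of each split, so each pair of complementary proper subsets is examined exactly once per direction.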
4. A retrieval system applying the Indonesian-Chinese cross-language retrieval method fusing association patterns and user feedback according to claim 1, characterized by comprising the following 4 modules and 3 databases:
Machine translation module: this module uses the Bing machine translation interface to translate the Indonesian user query into a Chinese query, and to translate the final Chinese result documents into Indonesian documents that are submitted to the user;
Search engine module: this module is a search engine that retrieves on the Internet with the translated Chinese query, obtaining the cross-language initial-retrieval result document set;
Fully weighted association pattern mining and user relevance feedback module: submits the top r documents of the cross-language initial-retrieval result document set to the user, who judges the relevance of these documents to determine the initial-retrieval relevant document database; fully weighted association rule mining is then applied to this database to mine expansion terms related to the query, realizing cross-language query expansion; the expansion terms are combined with the original query and retrieval is run again to obtain the final Chinese result documents;
Final result display module: submits the final Chinese result documents to the machine translation module for translation into Indonesian documents, and returns both the final Chinese result documents and the final Indonesian result documents to the user;
Initial-retrieval relevant document database;
Fully weighted association rule base;
Expansion term dictionary.
5. The retrieval system according to claim 4, characterized in that the fully weighted association pattern mining and user relevance feedback module comprises the following 5 modules:
User click-behavior relevance feedback extraction module: captures the document download behavior produced while the user browses the initial-retrieval result document set, and extracts the initially retrieved documents that the user downloaded to build the user-feedback relevant document set;
Document preprocessing module: preprocesses the user-feedback relevant document set (Chinese word segmentation, stop-word removal, feature-word weight calculation and feature-word extraction) and builds the initial-retrieval relevant document database;
Fully weighted association rule mining module: performs fully weighted association rule mining on the initial-retrieval relevant document database, mining fully weighted feature-term frequent itemsets and association rule patterns that contain original query terms, and builds the fully weighted association rule base;
Cross-language query expansion term generation module: extracts expansion terms related to the original query from the fully weighted association rule base and builds the expansion term dictionary;
Cross-language query expansion implementation module: extracts Chinese expansion terms from the expansion term dictionary, combines them with the original query into a new query, and submits it to the search engine again for retrieval on the Internet, obtaining the final Chinese result documents.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610827858.4A CN106484781B (en) | 2016-09-18 | 2016-09-18 | Indonesian-Chinese cross-language retrieval method and system fusing association patterns and user feedback |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106484781A (en) | 2017-03-08 |
CN106484781B (en) | 2019-03-15 |
Family
ID=58267229
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610827858.4A Expired - Fee Related CN106484781B (en) | 2016-09-18 | 2016-09-18 | Indonesian-Chinese cross-language retrieval method and system fusing association patterns and user feedback |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106484781B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103279570A (en) * | 2013-06-19 | 2013-09-04 | 广西教育学院 | Text database oriented matrix weighting negative pattern mining method |
CN104182527A (en) * | 2014-08-27 | 2014-12-03 | 广西教育学院 | Partial-sequence itemset based Chinese-English test word association rule mining method and system |
CN104216874A (en) * | 2014-09-22 | 2014-12-17 | 广西教育学院 | Chinese interword weighing positive and negative mode excavation method and system based on relevant coefficients |
Non-Patent Citations (3)
Title |
---|
DEBASIS GANGULY ET AL.: "Cross-Lingual Topical Relevance Models", 《24TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL LINGUISTICS》 * |
XUWEN WANG ET AL.: "LDA BASED PSEUDO RELEVANCE FEEDBACK FOR CROSS LANGUAGE INFORMATION RETRIEVAL", 《PROCEEDINGS OF IEEE CCIS2012》 * |
Huang Mingxuan (黄名选), Yan Xiaowei (严小卫), et al.: "Pseudo-relevance feedback query expansion based on matrix-weighted association rule mining", 《Journal of Software》 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107526839A (en) * | 2017-09-08 | 2017-12-29 | 广西财经学院 | Cross-language query post-translation consequent expansion method based on fully weighted positive and negative patterns |
CN107526839B (en) * | 2017-09-08 | 2019-09-10 | 广西财经学院 | Cross-language query post-translation consequent expansion method based on fully weighted positive and negative patterns |
CN109739953A (en) * | 2018-12-30 | 2019-05-10 | 广西财经学院 | Text retrieval method based on chi-square analysis-confidence framework and consequent expansion |
CN109739953B (en) * | 2018-12-30 | 2021-07-20 | 广西财经学院 | Text retrieval method based on chi-square analysis-confidence framework and consequent expansion |
CN109992644A (en) * | 2019-03-26 | 2019-07-09 | 苏州大成有方数据科技有限公司 | Intelligent semantic reconstruction system for intellectual property structured text |
CN109992644B (en) * | 2019-03-26 | 2022-07-12 | 苏州大成有方数据科技有限公司 | Intelligent semantic reconstruction system for intellectual property structured text |
CN111125102A (en) * | 2019-12-16 | 2020-05-08 | 北京明略软件系统有限公司 | Data query method and device based on index data |
CN111125102B (en) * | 2019-12-16 | 2023-03-21 | 北京明略软件系统有限公司 | Data query method and device based on index data |
Also Published As
Publication number | Publication date |
---|---|
CN106484781B (en) | 2019-03-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106372241B (en) | More across the language text search method of English and the system of word-based weighted association pattern | |
CN106484781B (en) | Merge the Indonesia's Chinese cross-language retrieval method and system of association mode and user feedback | |
Qin et al. | An efficient location extraction algorithm by leveraging web contextual information | |
CN103646112A (en) | Dependency parsing field self-adaption method based on web search | |
Guo et al. | Improving candidate generation for entity linking | |
CN107609095B (en) | Based on across the language inquiry extended method for weighting positive and negative regular former piece and relevant feedback | |
Afyouni et al. | AraCap: A hybrid deep learning architecture for Arabic Image Captioning | |
CN107526839B (en) | Consequent extended method is translated across language inquiry based on weight positive negative mode completely | |
CN109684463B (en) | Cross-language post-translation and front-part extension method based on weight comparison and mining | |
CN106383883B (en) | Indonesia's Chinese cross-language retrieval method and system based on matrix weights association mode | |
CN109739952A (en) | Merge the mode excavation of the degree of association and chi-square value and the cross-language retrieval method of extension | |
Wang et al. | Chinese text keyword extraction based on Doc2vec and TextRank | |
CN109684464B (en) | Cross-language query expansion method for realizing rule back-part mining through weight comparison | |
Yang et al. | Research on improvement of text processing and clustering algorithms in public opinion early warning system | |
CN109684465B (en) | Text retrieval method based on pattern mining and mixed expansion of item set weight value comparison | |
CN108170778B (en) | Chinese-English cross-language query post-translation expansion method based on fully weighted rule post-piece | |
Liu et al. | Recognition of collocation frames from sentences | |
CN108416442B (en) | Chinese word matrix weighting association rule mining method based on item frequency and weight | |
Azzopardi et al. | Page retrievability calculator | |
Ng et al. | Data Fusion of Machine-Learning Methods for the TREC5 Routing Task (and other work). | |
Ma et al. | Selecting related terms in query-logs using two-stage simrank | |
Caon et al. | Finding synonyms and other semantically-similar terms from coselection data | |
CN108133022B (en) | Matrix weighting association rule-based Chinese-English cross-language query front piece expansion method | |
Yan et al. | Research on Sino-Tibetan Machine Translation Based on the Reusing of Domain Ontology. | |
Liubonko et al. | Matching Ukrainian Wikipedia red links with English Wikipedia’s articles |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
Granted publication date: 20190315 Termination date: 20190918 |