CN109635090A - A copyright tracing method based on machine learning - Google Patents

A copyright tracing method based on machine learning

Publication number
CN109635090A
Authority
CN
China
Prior art keywords
article
similarity
keyword
typing
search result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811532787.0A
Other languages
Chinese (zh)
Inventor
王泽�
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Zhongchuan Puhua Technology Co Ltd
Original Assignee
Anhui Zhongchuan Puhua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Zhongchuan Puhua Technology Co Ltd filed Critical Anhui Zhongchuan Puhua Technology Co Ltd
Priority to CN201811532787.0A priority Critical patent/CN109635090A/en
Publication of CN109635090A publication Critical patent/CN109635090A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Technology Law (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a copyright tracing method based on machine learning and relates to the technical field of copyright tracing for online articles. The method comprises: building a neural network topic model and extracting keywords through semantic analysis of an article entered by a user; using the keywords as input parameters of a search engine to obtain a search result set; using a web crawler to obtain the target article in the web page corresponding to each search result in the set; and calculating the similarity between the target article in the web page and the content of the user-entered article with a Word2Vec model. The invention obtains keywords with high topic similarity to the article to be published, uses those keywords to retrieve matching target articles from internet websites, and finally compares the article to be published with each target article using a text content comparison algorithm to judge whether the article to be published infringes copyright. This simplifies operation, improves the efficiency of copyright tracing for internet articles, and improves the accuracy and credibility of the similarity computed between the article to be published and the target article.

Description

A copyright tracing method based on machine learning
Technical field
The invention belongs to the technical field of copyright tracing for online articles, and more particularly relates to a copyright tracing method based on machine learning.
Background art
As the internet plays an increasingly important role in daily life, more and more people read articles by browsing web pages on internet sites. Because the internet is open, many articles closely resemble one another, and some raise copyright problems. An infringement search is therefore needed before an internet article is published: the body of articles on the internet is searched for articles closely resembling the one to be published, in order to judge whether publication would constitute copyright infringement.
The prior art generally compares similarity against internet articles that have already been downloaded and does not use web crawling. The steps are cumbersome, efficiency is low, and the small base of articles available for comparison makes the result less credible. Moreover, traditional text content comparison algorithms split an article into sentences and then use a formula based on literal (string) distance to calculate the degree of matching between sentences, from which the similarity between articles is obtained. Such comparison is limited to the sentence level: many articles have low sentence-level similarity even though the content they express is very similar, so the resulting similarity has low credibility and large errors.
The present invention is directed to developing a copyright tracing method based on machine learning that addresses the low efficiency of existing copyright tracing methods for internet articles and the low credibility of their article similarity judgments.
Summary of the invention
The purpose of the present invention is to provide a copyright tracing method based on machine learning. The method obtains keywords with high topic similarity to the article to be published, uses those keywords to retrieve matching target articles from internet websites, and finally compares the article to be published with each target article using a text content comparison algorithm to judge whether the article to be published infringes copyright. It thereby addresses the low efficiency of existing copyright tracing methods for internet articles and the low credibility of their article similarity judgments.
To solve the above technical problems, the present invention is achieved by the following technical solutions:
The present invention is a copyright tracing method based on machine learning, comprising the following steps:
Step 1: build a neural network topic model, combining the TF-IDF algorithm with the TextRank algorithm, and extract keywords through semantic analysis of the user-entered article;
Step 2: use the keywords as input parameters of a search engine to obtain a search result set;
Step 3: use a web crawler to obtain the target article in the web page corresponding to each search result in the search result set;
Step 4: calculate the similarity between the target article in the web page and the content of the user-entered article with a Word2Vec model, and judge from that similarity whether the user-entered article infringes the copyright of the target article.
Preferably, the detailed process of step 1 is as follows:
Segment the content of the entered article into words and obtain its candidate keywords according to part of speech;
Learn a topic model from a large-scale corpus, and calculate the topic distribution of the entered article and the distribution of its candidate words;
Calculate the topic similarity between the topic of the entered article and each candidate keyword, and sort the candidates by it;
Select the candidates with the highest topic similarity as keywords, in descending order. Typically 10 are selected; the exact number is determined by the number of candidate keywords.
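The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patent's implementation: the function name `select_keywords`, the parameter `max_keywords`, and the example scores are hypothetical; only the typical count of 10 and the cap by the number of candidates come from the text.

```python
from typing import Dict, List, Tuple


def select_keywords(
    topic_similarity: Dict[str, float], max_keywords: int = 10
) -> List[Tuple[str, float]]:
    """Rank candidate keywords by topic similarity and keep the top ones."""
    # Sort candidates from highest to lowest topic similarity.
    ranked = sorted(topic_similarity.items(), key=lambda item: item[1], reverse=True)
    # Never return more keywords than there are candidates.
    return ranked[: min(max_keywords, len(ranked))]


# Hypothetical candidate scores, made up for this example.
candidate_scores = {"rocket": 0.31, "spacecraft": 0.44, "audience": 0.05}
top_two = select_keywords(candidate_scores, max_keywords=2)
all_ranked = select_keywords(candidate_scores)
```

With fewer than 10 candidates, as in `all_ranked`, every candidate is returned in ranked order.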
Preferably, step 3 further includes the following process:
Obtain a classifier by training and use it to extract target information, such as the body content and the publication time;
When extracting the target information, discard advertisements and other extraneous information in the HTML.
Preferably, step 4 includes the following process:
A deep Siamese bidirectional LSTM network implemented in Keras captures phrase/sentence similarity using word embeddings; the similarity of two words is calculated as the cosine of the angle between their word vectors.
The invention has the following advantages:
1. The present invention obtains keywords with high topic similarity to the article to be published, uses those keywords to retrieve matching target articles from internet websites, and finally compares the article to be published with each target article using a text content comparison algorithm to judge whether the article to be published infringes copyright. This simplifies operation, improves the efficiency of copyright tracing for internet articles, and improves the accuracy and credibility of the similarity computed between the article to be published and the target article.
2. When comparing the article to be published with the target article, the present invention uses a deep Siamese bidirectional LSTM network implemented in Keras that captures phrase/sentence similarity using word embeddings, which improves the accuracy of the similarity judgment.
Of course, a product implementing the present invention need not achieve all of the above advantages at the same time.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed to describe the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the copyright tracing method based on machine learning of the present invention;
Fig. 2 is a flowchart of jieba word segmentation as implemented in the present invention;
Fig. 3 is a schematic diagram of the neural network model in the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the invention without creative effort fall within the protection scope of the invention.
Referring to Fig. 1, the present invention is a copyright tracing method based on machine learning, comprising the following steps:
Step 1: build a neural network topic model, combining the TF-IDF algorithm with the TextRank algorithm, and extract keywords through semantic analysis of the user-entered article;
Step 2: use the keywords as input parameters of a search engine to obtain a search result set;
Step 3: use a web crawler to obtain the target article in the web page corresponding to each search result in the search result set;
Step 4: calculate the similarity between the target article in the web page and the content of the user-entered article with a Word2Vec model.
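The way the four steps chain together can be sketched as a small pipeline. This is a structural illustration only, under stated assumptions: the name `trace_copyright` and its four callable parameters are hypothetical, and the toy stand-ins at the bottom (including the word-overlap score, which is a Jaccard placeholder and not Word2Vec) exist only so the sketch runs without a search engine or network access.

```python
from typing import Callable, List, Tuple


def trace_copyright(
    article: str,
    extract_keywords: Callable[[str], List[str]],   # step 1
    search: Callable[[List[str]], List[str]],       # step 2, returns result URLs
    fetch_article: Callable[[str], str],            # step 3, crawl and extract body
    similarity: Callable[[str, str], float],        # step 4
) -> List[Tuple[str, float]]:
    """Return (url, similarity) pairs, most similar target article first."""
    keywords = extract_keywords(article)
    results = []
    for url in search(keywords):
        target = fetch_article(url)
        results.append((url, similarity(article, target)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins (assumptions for illustration only).
pages = {"site-a": "rocket launch space", "site-b": "cooking recipes"}


def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap of word sets; a placeholder for the real similarity model."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


ranking = trace_copyright(
    "rocket launch space",
    lambda text: text.split()[:2],
    lambda keywords: list(pages),
    pages.get,
    word_overlap,
)
```

Passing each step in as a callable keeps the sketch independent of any particular keyword extractor, search API, crawler, or similarity model.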
The detailed process of step 1 is as follows:
Segment the content of the entered article into words and obtain its candidate keywords according to part of speech;
Learn a topic model from a large-scale corpus, and calculate the topic distribution of the entered article and the distribution of its candidate words;
Calculate the topic similarity between the topic of the entered article and each candidate keyword, and sort the candidates by it;
Select the candidates with the highest topic similarity as keywords, in descending order.
Step 3 further includes the following process:
Obtain a classifier by training and use it to extract target information;
When extracting the target information, discard advertisements and other extraneous information in the HTML.
Step 4 includes the following process:
A deep Siamese bidirectional LSTM network implemented in Keras captures phrase/sentence similarity using word embeddings; the similarity of two words is calculated as the cosine of the angle between their word vectors.
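The cosine measure mentioned above can be sketched as follows. This is a minimal pure-Python illustration, assuming the two word vectors are already available as equal-length sequences of floats; the function name `cosine_similarity` and the choice to return 0.0 for a zero vector are assumptions made here, not details from the patent.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two word vectors of the same dimension."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing in the same direction score close to 1 regardless of their length, which is why the measure suits embeddings whose magnitudes vary.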
Specific embodiment 1:
Step 1: Assume that the article to be published entered by the user is as follows:
During the Zhuhai Airshow in early November, the exhibition area for the core module of the China Space Station, shown to the public for the first time, welcomed a special visitor: the space hero Yang Liwei.
On October 15, 2003, Yang Liwei set out to explore space, carrying the great trust of the motherland and the people. "10, 9, 8 ..." As the commander's countdown came through, Yang Liwei could not help raising his right hand and giving the motherland and the people a solemn military salute. At 9 o'clock sharp, amid a deafening roar, the rocket lifted off, carrying Yang Liwei into space.
The rocket flew faster and faster: escape tower separation, booster separation, first- and second-stage separation, fairing separation ... In an instant Yang Liwei felt his body suddenly begin to float, and he realised that the spacecraft had escaped the Earth's gravity and reached space. Through the porthole, the beautiful Earth lay beneath a thin layer of cloud, its long coastlines clearly visible. The spacecraft circled the Earth at high speed, one orbit every 90 minutes, day and night alternating, and the edge of the Earth seemed to be inlaid with a beautiful golden rim. Yang Liwei wrote down one sentence, "For the peace and progress of mankind, the Chinese have come to space", and showed it in front of the cabin camera to the people of the motherland and the people of the world.
After 14 orbits lasting 21 hours and 23 minutes, Yang Liwei, piloting the Shenzhou 5 spacecraft, completed China's first manned space flight, marking a crucial step for the Chinese people on the journey to the summit of world science and technology. From the Shenzhou 5 manned spacecraft and the Long March 2F carrier rocket to launch, tracking and control, astronaut training and support, and the teams that searched for the return capsule after the spacecraft came back to Earth, behind Yang Liwei's flight and the success of the first manned mission lay the unremitting effort and hard work of thousands upon thousands of space workers.
Since Yang Liwei left the first Chinese presence in the vastness of space, the Chinese astronaut corps has flown mission after mission. To date, 11 astronauts have made 14 person-flights on 6 manned space missions, orbiting the Earth more than 1,000 times over more than 46 million kilometres and setting new Chinese heights in space time after time. With each triumphant expedition and each cruise through the heavens, China's steps in exploring space reach farther and farther.
Today China's space programme has entered the space station era, and the new development of the space industry places new demands on the astronaut corps. In the selection of the third batch of astronauts, engineers and payload specialists have been added to the astronaut categories alongside traditional pilots.
"I tell the newly recruited astronauts that the question now is no longer whether a person will get to fly, but how many times a person can fly," Yang Liwei said. Once the space station is built, astronauts will need to stay in orbit for long periods, which places new demands on both the number and the quality of astronauts. Future space missions will become more and more frequent, and astronauts will fly deeper and farther into space.
For the full content, see:
http://politics.people.com.cn/n1/2018/1128/c1001-30428809.html
Step 2: Keyword extraction from the article to be published gives the following result:
(Zhuhai Airshow, 0.1479898), (Yang Liwei, 0.08185993), (commander, 0.037163872), (China Space Station, 0.03713809), (people, 0.030622173), (module exhibition area, 0.029190885), (space, 0.019004112), (core, 0.01749658), (spectator, 0.01749658), (space hero, 0.01749658), (right hand, 0.017248483), (motherland, 0.017134596);
In each pair, the left element is the keyword and the right element is its weight relative to the central idea of the article; the larger the value, the better the keyword represents the core content of the article.
The core steps of the algorithm are word segmentation and keyword extraction;
Referring to Fig. 2, word segmentation is performed with jieba;
Keyword extraction is as follows:
It is implemented with the TF-IDF algorithm;
TF: indicates the importance of a word within the article; core words generally occur many times in the article;
IDF: indicates how distinctive a word is; the less often a specialised word occurs across the whole corpus, the more strongly it is associated with the article's topic. The formulas are as follows:
TF = number of times the word occurs in the document;
IDF = log2(total number of documents / (number of documents containing the word + 1));
TF-IDF = TF * IDF.
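The TF-IDF weighting above can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's code: it reads the IDF formula as log2(N / (n + 1)), where N is the total number of documents and n the number of documents containing the word; the function name `tf_idf` and the tiny corpus are hypothetical, and documents are assumed to be already segmented into word lists.

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(document: List[str], corpus: List[List[str]]) -> Dict[str, float]:
    """Score each word of `document` as raw term count times log2 IDF."""
    term_counts = Counter(document)          # TF: occurrences in this document
    total_docs = len(corpus)
    scores: Dict[str, float] = {}
    for word, count in term_counts.items():
        containing = sum(1 for doc in corpus if word in doc)
        # The +1 in the denominator follows the formula as read above.
        idf = math.log2(total_docs / (containing + 1))
        scores[word] = count * idf
    return scores


# Hypothetical four-document corpus of pre-segmented words.
corpus = [
    ["rocket", "rocket", "space"],
    ["space", "station"],
    ["space", "news"],
    ["weather", "news"],
]
scores = tf_idf(corpus[0], corpus)
```

Here "rocket" occurs twice in the first document and in no other, so it outranks "space", which appears in three of the four documents. Note that under this reading a word present in every document receives a negative IDF.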
Step 3:
Some of the article's keywords are extracted as parameters for the search engine API:
For the keywords above, the search parameter can be expressed as "Zhuhai Airshow+Yang Liwei+commander+China Space Station+people";
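The query string shown above can be assembled from the ranked keywords roughly as in the sketch below. The function name `build_query`, the parameter `top_n`, and the example keywords are hypothetical; the text only shows that the five highest-weighted keywords are joined with "+", and does not say which search engine API receives the result, so no request is made here.

```python
from typing import List, Tuple


def build_query(keywords: List[Tuple[str, float]], top_n: int = 5) -> str:
    """Join the top_n highest-weighted keywords with '+'."""
    ranked = sorted(keywords, key=lambda pair: pair[1], reverse=True)
    return "+".join(word for word, _ in ranked[:top_n])


# Hypothetical (keyword, weight) pairs, made up for this example.
weighted = [("space", 0.02), ("rocket", 0.15), ("launch", 0.08)]
query = build_query(weighted, top_n=2)
```

A real client would still need to URL-encode the string according to the rules of whichever search API it calls.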
The URLs of the search results are as follows:
politics.people.com.cn/n1/2018/1128/c1001-30428616.html;
news.xhby.net/system/2018/11/28/030900255.shtml;
hn.people.com.cn/n2/2018/1024/c338398-32195524.html;
mil.news.sina.com.cn/china/2018.../doc-ifwnpcnt8635417.shtml;
https://user.guancha.cn/main/content?id=51351&s=fwtjgzwz;
news.cctv.com/.../ARTIvIYpw0dnf0eMl8TGdMnw181024.shtml;
www.81.cn/jmywyl/2018-10/24/content_9321349.html;
www.taikongmedia.com/Item/Show.asp?m=1&d=25553;
https://crossasia.org/en.html;
scitech.people.com.cn/n/2014/1113/c1007-26017793.html.
Step 4: Crawl the content of a search result site. The site crawled is:
http://news.xhby.net/system/2018/11/28/030900255.shtml
Its content is as follows:
(The crawled body text is word for word identical to the article entered in Step 1 and is not repeated here.)
This process is implemented with the open-source Java library Boilerpipe; the basic idea of the algorithm is to train a classifier that extracts the information we need.
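To make the extraction goal concrete, the sketch below pulls paragraph text out of HTML while skipping scripts, navigation, and very short blocks. It is a hand-written heuristic stand-in and explicitly not Boilerpipe or a trained classifier: the class name `BodyTextExtractor`, the function `extract_body`, the skipped tag list, and the `min_chars` threshold are all assumptions chosen for illustration.

```python
from html.parser import HTMLParser
from typing import List

# Tags whose content is treated as non-article material (an assumption).
SKIPPED_TAGS = {"script", "style", "nav", "aside", "footer", "header"}


class BodyTextExtractor(HTMLParser):
    """Collect the text of <p> blocks that lie outside skipped regions."""

    def __init__(self, min_chars: int = 20) -> None:
        super().__init__()
        self.min_chars = min_chars
        self.blocks: List[str] = []
        self._skip_depth = 0
        self._in_paragraph = False
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "p" and self._skip_depth == 0:
            self._in_paragraph = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "p" and self._in_paragraph:
            text = " ".join("".join(self._buffer).split())
            # Very short paragraphs are assumed to be ads or link labels.
            if len(text) >= self.min_chars:
                self.blocks.append(text)
            self._in_paragraph = False

    def handle_data(self, data):
        if self._in_paragraph and self._skip_depth == 0:
            self._buffer.append(data)


def extract_body(html: str, min_chars: int = 20) -> str:
    parser = BodyTextExtractor(min_chars)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.blocks)


sample = (
    "<html><body><nav><p>Home | News | Sports | Contact us today</p></nav>"
    "<p>Buy now!</p><p>The rocket lifted off at nine o'clock sharp.</p>"
    "<script>var x = 'ignored text inside a script';</script></body></html>"
)
body = extract_body(sample)
```

A trained classifier, as the text describes, would replace the fixed tag list and length threshold with features learned from labelled pages.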
Step 5: Compare the similarity between the search result web page and the content of the user-entered article
Text similarity is computed with a deep Siamese bidirectional LSTM network implemented in Keras, which captures phrase/sentence similarity using word embeddings.
Fig. 3 shows a schematic diagram of the neural network model;
Finally, PhantomJS is used to take screenshots of the search result websites, which serve as evidence for rights enforcement in copyright tracing.
It is worth noting that the units included in the above system embodiment are divided only according to functional logic; the division is not limited to the above as long as the corresponding functions can be realised. In addition, the specific names of the functional units serve only to distinguish them from one another and are not intended to limit the protection scope of the invention.
In addition, those of ordinary skill in the art will understand that all or part of the steps of the methods in the above embodiments can be completed by a program instructing the relevant hardware, and that the program can be stored in a computer-readable storage medium.
The preferred embodiments of the invention disclosed above are intended only to help illustrate the invention. They do not describe every detail, nor do they limit the invention to the specific embodiments described. Obviously, many modifications and variations can be made in light of this specification. The embodiments were chosen and described in detail to better explain the principles and practical application of the invention, so that those skilled in the art can better understand and use it. The invention is limited only by the claims and their full scope and equivalents.

Claims (4)

1. A copyright tracing method based on machine learning, characterized by comprising the following steps:
Step 1: build a neural network topic model, combining the TF-IDF algorithm with the TextRank algorithm, and extract keywords through semantic analysis of the user-entered article;
Step 2: use the keywords as input parameters of a search engine to obtain a search result set;
Step 3: use a web crawler to obtain the target article in the web page corresponding to each search result in the search result set;
Step 4: calculate the similarity between the target article in the web page and the content of the user-entered article with a Word2Vec model.
2. The copyright tracing method based on machine learning according to claim 1, characterized in that the detailed process of step 1 is as follows:
Segment the content of the entered article into words and obtain its candidate keywords according to part of speech;
Learn a topic model from a large-scale corpus, and calculate the topic distribution of the entered article and the distribution of its candidate words;
Calculate the topic similarity between the topic of the entered article and each candidate keyword, and sort the candidates by it;
Select the candidates with the highest topic similarity as keywords, in descending order.
3. The copyright tracing method based on machine learning according to claim 1, characterized in that step 3 further includes the following process:
Obtain a classifier by training and use it to extract target information;
When extracting the target information, discard advertisements and other extraneous information in the HTML.
4. The copyright tracing method based on machine learning according to claim 1, characterized in that step 4 includes the following process:
A deep Siamese bidirectional LSTM network implemented in Keras captures phrase/sentence similarity using word embeddings; the similarity of two words is calculated as the cosine of the angle between their word vectors.
CN201811532787.0A 2018-12-14 2018-12-14 A kind of copyright method for tracing based on machine learning Pending CN109635090A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811532787.0A CN109635090A (en) 2018-12-14 2018-12-14 A kind of copyright method for tracing based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811532787.0A CN109635090A (en) 2018-12-14 2018-12-14 A kind of copyright method for tracing based on machine learning

Publications (1)

Publication Number Publication Date
CN109635090A true CN109635090A (en) 2019-04-16

Family

ID=66074021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811532787.0A Pending CN109635090A (en) 2018-12-14 2018-12-14 A kind of copyright method for tracing based on machine learning

Country Status (1)

Country Link
CN (1) CN109635090A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110264351A (en) * 2019-05-15 2019-09-20 阿里巴巴集团控股有限公司 Copyright distribution method and device based on block chain
CN111488555A (en) * 2020-04-02 2020-08-04 上海七印信息科技有限公司 Copyright authentication method and device, computer equipment and storage medium
CN112000929A (en) * 2020-07-29 2020-11-27 广州智城科技有限公司 Cross-platform data analysis method, system, equipment and readable storage medium
CN113064979A (en) * 2021-03-10 2021-07-02 国网河北省电力有限公司 Keyword retrieval-based method for judging construction period and price reasonability
US11093650B2 (en) 2019-05-15 2021-08-17 Advanced New Technologies Co., Ltd. Blockchain-based copyright distribution

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150206101A1 (en) * 2014-01-21 2015-07-23 Our Tech Co., Ltd. System for determining infringement of copyright based on the text reference point and method thereof
CN107633020A (en) * 2017-08-24 2018-01-26 新译信息科技(深圳)有限公司 Article similarity detection method and device
CN107644010A (en) * 2016-07-20 2018-01-30 阿里巴巴集团控股有限公司 A kind of Text similarity computing method and device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150206101A1 (en) * 2014-01-21 2015-07-23 Our Tech Co., Ltd. System for determining infringement of copyright based on the text reference point and method thereof
CN107644010A (en) * 2016-07-20 2018-01-30 阿里巴巴集团控股有限公司 A kind of Text similarity computing method and device
CN107633020A (en) * 2017-08-24 2018-01-26 新译信息科技(深圳)有限公司 Article similarity detection method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
汪润等: ""DeepRD:基于 Siamese LSTM 网络的 Android 重打包应用检测方法"", 《通信学报》, 25 August 2018 (2018-08-25), pages 1 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110264351A (en) * 2019-05-15 2019-09-20 阿里巴巴集团控股有限公司 Copyright distribution method and device based on block chain
CN112651836A (en) * 2019-05-15 2021-04-13 创新先进技术有限公司 Copyright distribution method and device based on block chain
US11093650B2 (en) 2019-05-15 2021-08-17 Advanced New Technologies Co., Ltd. Blockchain-based copyright distribution
CN111488555A (en) * 2020-04-02 2020-08-04 上海七印信息科技有限公司 Copyright authentication method and device, computer equipment and storage medium
CN112000929A (en) * 2020-07-29 2020-11-27 广州智城科技有限公司 Cross-platform data analysis method, system, equipment and readable storage medium
CN113064979A (en) * 2021-03-10 2021-07-02 国网河北省电力有限公司 Keyword retrieval-based method for judging construction period and price reasonability

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190416