CN111859887A - Scientific and technological news automatic writing system based on deep learning - Google Patents

Scientific and technological news automatic writing system based on deep learning

Info

Publication number
CN111859887A
Authority
CN
China
Prior art keywords
news
scientific
module
deep learning
technological
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010707063.6A
Other languages
Chinese (zh)
Inventor
刘超
刘霖雯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou chaos Information Technology Co.,Ltd.
Original Assignee
Beijing Beidou Tianxun Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Beidou Tianxun Technology Co Ltd
Priority to CN202010707063.6A
Publication of CN111859887A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a deep-learning-based system for automatically writing science and technology news, in the technical field of news writing. The system comprises a web crawler module; a science and technology news preprocessing module; a science and technology news classification and clustering module; a science and technology news deep learning generation training module; an automatic news generation module; and a generated-news display module. The system generates science and technology news rapidly and can produce news in different styles to suit, for example, the styles of different websites.

Description

Scientific and technological news automatic writing system based on deep learning
Technical Field
The invention relates to the technical field of news writing, and in particular to a deep-learning-based automatic writing system for science and technology news, used for information processing and for drafting science and technology news manuscripts.
Background
News falls into many categories, such as people's livelihood, current affairs and military affairs. The columns and sections once seen in newspapers have steadily moved to the Internet, and news websites of every kind keep emerging.
Science and technology news reports recent facts specific to science and technology. Most of it is conference news whose material comes from conference manuscripts and related reports rather than dedicated interviews, so the source material is very important. Science and technology reporting also demands rational, scientific thinking from the reporter. With the development of the Internet and the growing number of science and technology figures and events, the volume of related reports rises every day, so more and more reporting effort goes to this category, and the cost of news reporting rises accordingly.
To reduce the cost of reporting science and technology news, the inventors turned to a recent research result from DeepMind, which substantially improves the performance of recurrent neural networks (RNNs), a deep learning architecture widely used in speech recognition, image recognition, semantic understanding and other fields. That research mainly augments temporal generative models with external memory, and it offers some inspiration for research in deep learning.
Based on summarizing and analyzing science and technology news written by human authors, the invention discloses a writing system implemented with machine learning methods.
Disclosure of Invention
The purpose of the invention is as follows: to generate science and technology news rapidly, and to generate news in different styles according to, for example, the styles of different websites, the invention provides a deep learning system for automatically generating and writing science and technology news, trained at large scale on massive data.
To achieve this purpose, the invention adopts the following technical solution: a deep-learning-based automatic writing system for science and technology news, characterized by comprising the following modules:
A web crawler module: collects science and technology channels and science and technology news from websites, collects related content from science and technology websites, extracts the body text from the collected data, and stores the extracted data in a database;
A science and technology news preprocessing module: performs word segmentation, named entity recognition, entity relation extraction, syntactic analysis and semantic analysis on the collected news;
A science and technology news classification and clustering module: further refines the science and technology news content, applies intelligent classification and clustering techniques to classify the news in detail, trains on the news content with a deep-learning-based generative memory model, and finally produces a news generation model based on that generative memory model;
A science and technology news deep learning generation training module: uses a classification system based on SVM and on the deep-learning-based TextRNN; for news whose category is uncertain, it also runs an unsupervised clustering algorithm, using LDA-based automatic clustering to cluster content whose classification score deviates from the classification threshold;
An automatic news generation module: once the user enters the keywords, writing style, time and other elements of the news to be written, the news generation model automatically retrieves and writes the news content the user requires and displays it to the user;
A generated-news display module: transmits the news produced by the automatic news generation module to designated forums and news websites over a designated network protocol, where users score it; the quality of the generated news is evaluated and fed back to the fourth module (the deep learning generation training module) for continuous optimization and improvement, so that a page of basically readable news content is finally obtained.
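The split between supervised classification and unsupervised clustering described for uncertain news can be sketched as a simple routing step. This is a minimal illustration, not the patented implementation: it assumes a classifier (for example the SVM or TextRNN mentioned in the text) has already produced per-category scores, and the function name `route_by_confidence`, the score format and the threshold value 0.6 are all hypothetical.

```python
from typing import Dict, List, Tuple


def route_by_confidence(
    scored: List[Tuple[str, Dict[str, float]]],
    threshold: float = 0.6,  # hypothetical cut-off, not taken from the patent
) -> Tuple[Dict[str, str], List[str]]:
    """Keep confident classifications; send uncertain articles on to clustering."""
    classified: Dict[str, str] = {}
    to_cluster: List[str] = []
    for article_id, scores in scored:
        # Pick the category with the highest classifier score.
        label, best = max(scores.items(), key=lambda kv: kv[1])
        if best >= threshold:
            classified[article_id] = label
        else:
            # Category is uncertain: leave it for unsupervised clustering (e.g. LDA).
            to_cluster.append(article_id)
    return classified, to_cluster


# Toy scores standing in for real classifier output.
scored_articles = [
    ("a1", {"ai": 0.91, "hardware": 0.09}),
    ("a2", {"ai": 0.52, "hardware": 0.48}),
]
classified, to_cluster = route_by_confidence(scored_articles)
```

Here article a1 is accepted with its top category, while a2 falls below the threshold and is left for the clustering step.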
Further, the science and technology news preprocessing module comprises:
A news content word segmentation sub-module: for news body text and titles, converts between Traditional and Simplified Chinese, unifies letter case, deletes invalid characters and the like, segments the processed content into words, and removes stop words to form the candidate data set for processing;
A news named entity recognition module: recognizes the person names, place names, organization names, product names, technical terms, times of occurrence and the like in the news;
A news entity relation extraction module: extracts and refines the relations among the recognized entities; entities are recognized with CRF++, and entity relations are then extracted according to HowNet and a manually built, annotated entity relation knowledge base, in preparation for the subsequent deep learning training;
A news text content analysis module: performs syntactic analysis on the news content; the syntactic parser is based on the Stanford parser with Chinese support, analyzes the syntactic structure of each sentence and the contextual relations between sentences, and produces a labeled sequence of the parse;
A news text semantic analysis module: analyzes and processes the persons, companies, scientific abbreviations, product abbreviations, company abbreviations and person-related positions in science and technology news reports; replaces and expands synonyms and near-synonyms using semantic resources; computes semantic relatedness with word2vec; and compiles part of the synonyms, near-synonyms and related words from the crawled text. Further, the word segmentation sub-module comprises a word segmentation system, which is the ansj word segmentation system with a CRF++-based named entity recognition component embedded.
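The normalization steps of the word segmentation sub-module (Traditional-to-Simplified conversion, case unification, removal of invalid characters, stop-word removal) can be sketched as follows. This is an illustration under stated assumptions: the two-entry conversion table `T2S` and the `STOP_WORDS` set are toy placeholders, and whitespace splitting stands in for a real segmenter such as ansj, which is not reproduced here.

```python
import re
from typing import Dict, List, Set

# Toy Traditional-to-Simplified table; a real system would use a complete mapping.
T2S: Dict[str, str] = {"學": "学", "習": "习", "機": "机"}
# Toy stop-word list (assumption for illustration only).
STOP_WORDS: Set[str] = {"the", "a", "of", "的", "了"}


def normalize(text: str) -> str:
    """Convert Traditional characters, lowercase, and strip invalid characters."""
    text = "".join(T2S.get(ch, ch) for ch in text)
    text = text.lower()
    # Replace anything that is not a word character or whitespace with a space.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace left behind by the previous step.
    return re.sub(r"\s+", " ", text).strip()


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


# Whitespace split is a stand-in for a real Chinese word segmenter.
tokens = remove_stop_words(normalize("The  Rise of 機器 學習!!").split())
```

The resulting token list is what the text calls the candidate data set for later processing.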
Further, CRF++ is implemented in C++ and makes heavy use of STL data structures. After a close reading of the source code, the invention rewrites in C the STL-related parts of the source code. Specifically, the source file tagger.cpp uses a vector<const char *> structure:
[Figure BDA0002595208930000031: source code excerpt, image not reproduced]
In addition, because memory is not released after feature encoding, the invention replaces the std::vector<std::vector<const char *> > member x_ of TaggerImpl and forces the memory to be released immediately; experimental comparison shows a memory reduction of 10%. Regarding the modification to L-BFGS in CRF++: the L-BFGS algorithm is an improvement on the quasi-Newton method; as its name indicates, it is a limited-memory variant of the BFGS algorithm. The basic idea of L-BFGS is that it stores and uses only the curvature information of the most recent m iterations to construct an approximation of the Hessian matrix. The step size along the iteration direction is optimized and automatically adjusted according to the observed training effect, which keeps the training result from deviating too much.
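The idea of using only the most recent m curvature pairs can be illustrated with the standard L-BFGS two-loop recursion. This is a generic textbook sketch, not the modified CRF++ code described in the patent; the function and variable names are chosen for illustration, and the step-size adjustment mentioned in the text is not shown.

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def lbfgs_direction(grad: Vector, s_hist: List[Vector], y_hist: List[Vector]) -> Vector:
    """Two-loop recursion: approximate -H^{-1} * grad from stored (s, y) pairs.

    s_hist[k] is a past step (x_{k+1} - x_k) and y_hist[k] the matching
    gradient change; both lists hold at most the latest m pairs, oldest first.
    """
    q = list(grad)
    rhos = [1.0 / dot(y, s) for s, y in zip(s_hist, y_hist)]
    alphas: List[float] = []
    # First loop: walk from the newest pair back to the oldest.
    for s, y, rho in zip(reversed(s_hist), reversed(y_hist), reversed(rhos)):
        alpha = rho * dot(s, q)
        alphas.append(alpha)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
    # Scale by gamma = s^T y / y^T y as the initial inverse-Hessian guess.
    if s_hist:
        gamma = dot(s_hist[-1], y_hist[-1]) / dot(y_hist[-1], y_hist[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    # Second loop: walk from the oldest pair forward to the newest.
    for s, y, rho, alpha in zip(s_hist, y_hist, rhos, reversed(alphas)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]


# With no history the direction is plain steepest descent.
steepest = lbfgs_direction([2.0, -4.0], [], [])
# One curvature pair from f(x) = x0^2 + x1^2 (Hessian 2*I) along the first axis.
direction = lbfgs_direction([2.0, 0.0], [[1.0, 0.0]], [[2.0, 0.0]])
```

Memory stays bounded because only the stored pairs are used; keeping them in a `collections.deque(maxlen=m)` would be one way to discard the oldest pair automatically.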
Further, the news named entity recognition module trains on a corpus and uses a CRF++ model to recognize the person names, place names, organization names, product names, technical terms, times of occurrence and the like in the news. Further, the automatic news generation module comprises a user interaction module and a news generation module.
Further, in the user interaction module, the user enters keywords for the science and technology article to be generated; the generated writing model automatically retrieves the learned sentences related to those keywords and then learns the relations among the keywords, so that the news content transitions smoothly at the article and section level. The module combines multiple keywords and writing styles, analyzes and decomposes them with a recurrent neural network, and adds new mechanisms that store and protect long-range information in memory.
The invention has the following beneficial effects:
1. The learning material for writing comes from the Internet. A highly general-purpose crawler collects the science and technology news data, which greatly increases collection speed, and the content, titles, abstracts and body text can be extracted from what is collected. When a new news type is found, its data can be collected and stored quickly simply by configuring the collection source. This addresses the large amount of manpower and material resources that science and technology news writing requires, and reduces labor and time costs.
2. In the news content preprocessing stage, an independently developed intelligent word segmentation system is used, which effectively improves word segmentation accuracy and provides a sound basis for data processing.
3. The invention uses an improved CRF++ for named entity recognition and entity relation extraction, which gives higher recognition accuracy; it also performs syntactic and semantic analysis and segments article content effectively, which greatly improves the generalization ability of the final deep learning model.
4. The invention uses intelligent classification and clustering algorithms to aggregate the collected news effectively, which facilitates deep learning of each writing style.
5. The invention uses deep learning to learn the writing style, writing mode, content characteristics, length, writing scenario and the like of each collected and classified piece of science and technology news, and thereby generates a writing model.
6. The science and technology news generation system is simple to use: entering a few keywords, a writing style and a news sub-type is enough to generate a draft quickly, so writing is low-cost and fast.
7. The invention can be used for science and technology news writing and, as its generalization ability improves, can be extended quickly to other fields, so it is readily extensible.
Drawings
FIG. 1 is a schematic diagram of the overall architecture of the present invention;
FIG. 2 is a content processing flow diagram of the present invention;
FIG. 3 is a diagram of the training model generation for automatically generating news in accordance with the present invention;
FIG. 4 is a flow chart of the automatic news generation of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Furthermore, the terms "first," "second," and the like are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.
In the description of the embodiments of the present invention, it should be noted that the terms "inside", "outside", "upper", and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings or orientations or positional relationships conventionally arranged when products of the present invention are used, and are only used for convenience in describing the present invention and simplifying the description, but do not indicate or imply that the devices or elements indicated must have specific orientations, be constructed in specific orientations, and operated, and thus, cannot be construed as limiting the present invention.
Example 1
As shown in FIG. 1 to FIG. 4, a deep-learning-based automatic writing system for science and technology news is characterized by comprising the following modules:
A web crawler module: collects science and technology channels and science and technology news from websites, collects related content from science and technology websites, extracts the body text from the collected data, and stores the extracted data in a database;
A science and technology news preprocessing module: performs word segmentation, named entity recognition, entity relation extraction, syntactic analysis and semantic analysis on the collected news;
A science and technology news classification and clustering module: further refines the science and technology news content, applies intelligent classification and clustering techniques to classify the news in detail, trains on the news content with a deep-learning-based generative memory model, and finally produces a news generation model based on that generative memory model;
A science and technology news deep learning generation training module: uses a classification system based on SVM and on the deep-learning-based TextRNN; for news whose category is uncertain, it also runs an unsupervised clustering algorithm, using LDA-based automatic clustering to cluster content whose classification score deviates from the classification threshold;
An automatic news generation module: once the user enters the keywords, writing style, time and other elements of the news to be written, the news generation model automatically retrieves and writes the news content the user requires and displays it to the user;
A generated-news display module: transmits the news produced by the automatic news generation module to designated forums and news websites over a designated network protocol, where users score it; the quality of the generated news is evaluated and fed back to the fourth module (the deep learning generation training module) for continuous optimization and improvement, so that a page of basically readable news content is finally obtained.
The science and technology news preprocessing module comprises:
A news content word segmentation sub-module: for news body text and titles, converts between Traditional and Simplified Chinese, unifies letter case, deletes invalid characters and the like, segments the processed content into words, and removes stop words to form the candidate data set for processing;
A news named entity recognition module: recognizes the person names, place names, organization names, product names, technical terms, times of occurrence and the like in the news;
A news entity relation extraction module: extracts and refines the relations among the recognized entities; entities are recognized with CRF++, and entity relations are then extracted according to HowNet and a manually built, annotated entity relation knowledge base, in preparation for the subsequent deep learning training;
A news text content analysis module: performs syntactic analysis on the news content; the syntactic parser is based on the Stanford parser with Chinese support, analyzes the syntactic structure of each sentence and the contextual relations between sentences, and produces a labeled sequence of the parse;
A news text semantic analysis module: analyzes and processes the persons, companies, scientific abbreviations, product abbreviations, company abbreviations and person-related positions in science and technology news reports; replaces and expands synonyms and near-synonyms using semantic resources; computes semantic relatedness with word2vec; and compiles part of the synonyms, near-synonyms and related words from the crawled text. The word segmentation sub-module comprises a word segmentation system, which is the ansj word segmentation system with a CRF++-based named entity recognition component embedded.
In the user interaction module, the user enters keywords for the science and technology article to be generated; the generated writing model automatically retrieves the learned sentences related to those keywords and then learns the relations among the keywords, so that the news content transitions smoothly at the article and section level. The module combines multiple keywords and writing styles, analyzes and decomposes them with a recurrent neural network, and adds new mechanisms that store and protect long-range information in memory.
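Semantic relatedness over word2vec-style vectors is commonly measured with cosine similarity, which the sketch below illustrates. The three-dimensional vectors in `EMBEDDINGS` are invented toy values rather than trained word2vec output, and the helper `most_related` is a hypothetical name used only for this example.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy embeddings (assumed values); real vectors would come from a trained model.
EMBEDDINGS: Dict[str, List[float]] = {
    "phone": [0.9, 0.1, 0.0],
    "smartphone": [0.8, 0.2, 0.1],
    "rocket": [0.0, 0.1, 0.9],
}


def most_related(word: str, k: int = 1) -> List[str]:
    """Return the k vocabulary words most similar to `word`, excluding itself."""
    target = EMBEDDINGS[word]
    others = [w for w in EMBEDDINGS if w != word]
    others.sort(key=lambda w: cosine_similarity(target, EMBEDDINGS[w]), reverse=True)
    return others[:k]


related = most_related("phone")
```

Words ranked highly this way are candidates for the synonym and related-word expansion the semantic analysis module performs.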
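The text does not name the mechanism that stores and protects long-range information in the recurrent network. One widely used mechanism of that kind is the gated memory cell of an LSTM, and the sketch below assumes it purely for illustration. It is a scalar, single-unit cell with hand-picked weights; the class names `GateParams` and `ScalarLSTMCell` are hypothetical and nothing here is taken from the patent.

```python
import math
from dataclasses import dataclass
from typing import Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


@dataclass
class GateParams:
    """Weights of one gate: pre-activation = w_x * x + w_h * h_prev + b."""
    w_x: float
    w_h: float
    b: float

    def pre(self, x: float, h_prev: float) -> float:
        return self.w_x * x + self.w_h * h_prev + self.b


@dataclass
class ScalarLSTMCell:
    forget: GateParams
    inp: GateParams
    output: GateParams
    candidate: GateParams

    def step(self, x: float, h_prev: float, c_prev: float) -> Tuple[float, float]:
        f = sigmoid(self.forget.pre(x, h_prev))        # how much old memory to keep
        i = sigmoid(self.inp.pre(x, h_prev))           # how much new content to write
        o = sigmoid(self.output.pre(x, h_prev))        # how much memory to expose
        g = math.tanh(self.candidate.pre(x, h_prev))   # proposed new content
        c = f * c_prev + i * g                         # protected long-range memory
        h = o * math.tanh(c)                           # short-term output
        return h, c


# Hand-picked weights: forget gate nearly open, input gate nearly closed,
# so the cell state should survive many steps almost unchanged.
cell = ScalarLSTMCell(
    forget=GateParams(0.0, 0.0, 10.0),
    inp=GateParams(0.0, 0.0, -10.0),
    output=GateParams(0.5, 0.5, 0.0),
    candidate=GateParams(1.0, 1.0, 0.0),
)
h, c = 0.0, 1.0
for t in range(50):
    h, c = cell.step(x=math.sin(t), h_prev=h, c_prev=c)
```

With these settings the stored value stays close to its initial 1.0 across all 50 inputs, which is the sense in which gating protects long-range information.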
Example 2
As shown in FIG. 1 to FIG. 4, to reduce memory usage, this embodiment further improves on Example 1, specifically: CRF++ is implemented in C++ and makes heavy use of STL data structures. After a close reading of the source code, the invention rewrites in C the STL-related parts of the source code. Specifically, the source file tagger.cpp uses a vector<const char *> structure:
[Figure BDA0002595208930000061: source code excerpt, image not reproduced]
In addition, because memory is not released after feature encoding, the invention replaces the std::vector<std::vector<const char *> > member x_ of TaggerImpl and forces the memory to be released immediately; experimental comparison shows a memory reduction of 10%. Regarding the modification to L-BFGS in CRF++: the L-BFGS algorithm is an improvement on the quasi-Newton method; as its name indicates, it is a limited-memory variant of the BFGS algorithm. The basic idea of L-BFGS is that it stores and uses only the curvature information of the most recent m iterations to construct an approximation of the Hessian matrix. The step size along the iteration direction is optimized and automatically adjusted according to the observed training effect, which keeps the training result from deviating too much.
Example 3
As shown in FIG. 1 to FIG. 4, the news named entity recognition module trains on a corpus and uses a CRF++ model to recognize the person names, place names, organization names, product names, technical terms, times of occurrence and the like in the news. The automatic news generation module comprises a user interaction module and a news generation module.

Claims (7)

1. A deep-learning-based automatic writing system for science and technology news, characterized by comprising the following modules:
a web crawler module: collects science and technology channels and science and technology news from websites, collects related content from science and technology websites, extracts the body text from the collected data, and stores the extracted data in a database;
a science and technology news preprocessing module: performs word segmentation, named entity recognition, entity relation extraction, syntactic analysis and semantic analysis on the collected news;
a science and technology news classification and clustering module: further refines the science and technology news content, applies intelligent classification and clustering techniques to classify the news in detail, trains on the news content with a deep-learning-based generative memory model, and finally produces a news generation model based on that generative memory model;
a science and technology news deep learning generation training module: uses a classification system based on SVM and on the deep-learning-based TextRNN; for news whose category is uncertain, it also runs an unsupervised clustering algorithm, using LDA-based automatic clustering to cluster content whose classification score deviates from the classification threshold;
an automatic news generation module: once the user enters the keywords, writing style, time and other elements of the news to be written, the news generation model automatically retrieves and writes the news content the user requires and displays it to the user;
a generated-news display module: transmits the news produced by the automatic news generation module to designated forums and news websites over a designated network protocol, where users score it; the quality of the generated news is evaluated and fed back to the fourth module for continuous optimization and improvement, so that a page of basically readable news content is finally obtained.
2. The deep-learning-based automatic writing system for science and technology news according to claim 1, characterized in that the science and technology news preprocessing module comprises:
a news content word segmentation sub-module: for news body text and titles, converts between Traditional and Simplified Chinese, unifies letter case, deletes invalid characters and the like, segments the processed content into words, and removes stop words to form the candidate data set for processing;
a news named entity recognition module: recognizes the person names, place names, organization names, product names, technical terms, times of occurrence and the like in the news;
a news entity relation extraction module: extracts and refines the relations among the recognized entities; entities are recognized with CRF++, and entity relations are then extracted according to HowNet and a manually built, annotated entity relation knowledge base, in preparation for the subsequent deep learning training;
a news text content analysis module: performs syntactic analysis on the news content; the syntactic parser is based on the Stanford parser with Chinese support, analyzes the syntactic structure of each sentence and the contextual relations between sentences, and produces a labeled sequence of the parse;
a news text semantic analysis module: analyzes and processes the persons, companies, scientific abbreviations, product abbreviations, company abbreviations and person-related positions in science and technology news reports; replaces and expands synonyms and near-synonyms using semantic resources; computes semantic relatedness with word2vec; and compiles part of the synonyms, near-synonyms and related words from the crawled text.
3. The automatic scientific news authoring system based on deep learning of claim 2, wherein: the word segmentation sub-module comprises a word segmentation system, and the word segmentation system is an ansj word segmentation system based on a named entity recognition part embedded with crf + +.
4. The automatic scientific news authoring system for deep learning according to claim 3, wherein: the crf + + is realized by using a c + + language, a large number of stl data structures are applied, and c language is used for rewriting a part of code related to stl in the source code on the basis of deep reading of the source code.
5. The automatic scientific news authoring system based on deep learning of claim 2, wherein: the news named entity recognition module is used for training corpora and recognizing the name, place name, organization name, product name, professional nouns, occurrence time and the like of news by adopting a crf + + model.
6. The automatic scientific news authoring system based on deep learning of claim 1, wherein: the news automatic generation module comprises a user interaction module and a news generation module.
7. The automatic scientific news authoring system based on deep learning of claim 6, wherein: the user interaction module is mainly used as follows: the user inputs keywords for the content of the scientific and technological paper to be generated; the generated writing model automatically searches for and learns sentences related to the keywords, and then learns the relations among the keywords so that the news content transitions smoothly at the article and chapter level; the plurality of keywords and the writing style are combined, then analyzed and decomposed by a recurrent neural network, to which additional mechanisms are added to store and protect long-range information, which is memorized and stored.
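
The word2vec-based semantic relevance calculation recited in claim 2 is commonly realized as the cosine similarity between word vectors. The claims do not specify the measure, so the following Python sketch is only an illustration under that assumption. The function names `cosine_similarity` and `most_related` and the three-dimensional vectors in `TOY_VECTORS` are hypothetical; a real system would load vectors from a word2vec model trained on its own news corpus.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, invented for this example only.
TOY_VECTORS = {
    "smartphone": [0.9, 0.1, 0.3],
    "handset": [0.8, 0.2, 0.4],
    "satellite": [0.1, 0.9, 0.2],
}


def most_related(word, vectors, top_n=2):
    """Rank the other words in `vectors` by cosine similarity to `word`."""
    query = vectors[word]
    scored = [
        (other, cosine_similarity(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


if __name__ == "__main__":
    for other, score in most_related("smartphone", TOY_VECTORS):
        print(f"{other}: {score:.3f}")
```

Ranking candidate words by this score is one way a module could select near-synonyms and related words for replacement and expansion; the cutoff `top_n` is an arbitrary choice for the sketch.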
CN202010707063.6A 2020-07-21 2020-07-21 Scientific and technological news automatic writing system based on deep learning Pending CN111859887A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010707063.6A CN111859887A (en) 2020-07-21 2020-07-21 Scientific and technological news automatic writing system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010707063.6A CN111859887A (en) 2020-07-21 2020-07-21 Scientific and technological news automatic writing system based on deep learning

Publications (1)

Publication Number Publication Date
CN111859887A (en) 2020-10-30

Family

ID=73001468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010707063.6A Pending CN111859887A (en) 2020-07-21 2020-07-21 Scientific and technological news automatic writing system based on deep learning

Country Status (1)

Country Link
CN (1) CN111859887A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112989811A (en) * 2021-03-01 2021-06-18 哈尔滨工业大学 BiLSTM-CRF-based historical book reading auxiliary system and control method thereof
CN113590999A (en) * 2021-06-23 2021-11-02 小铁世纪(成都)科技有限公司 Adaptive content identification and release system based on small program

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104199972A (en) * 2013-09-22 2014-12-10 中科嘉速(北京)并行软件有限公司 Named entity relation extraction and construction method based on deep learning
US20170147682A1 (en) * 2015-11-19 2017-05-25 King Abdulaziz City For Science And Technology Automated text-evaluation of user generated text
CN107766338A (en) * 2017-10-18 2018-03-06 北京信息科技大学 A kind of sports news automatic generation method
CN108197294A (en) * 2018-01-22 2018-06-22 桂林电子科技大学 A kind of text automatic generation method based on deep learning
CN108334626A (en) * 2018-02-12 2018-07-27 百度在线网络技术(北京)有限公司 Generation method, device and the computer equipment of news program
CN108563620A (en) * 2018-04-13 2018-09-21 上海财梵泰传媒科技有限公司 The automatic writing method of text and system
CN109697255A (en) * 2017-10-23 2019-04-30 中国科学院沈阳自动化研究所 A kind of Personalize News jettison system and method based on automatic measure on line
CN109766410A (en) * 2019-01-07 2019-05-17 东华大学 A kind of newsletter archive automatic classification system based on fastText algorithm
CN110362674A (en) * 2019-07-18 2019-10-22 中国搜索信息科技股份有限公司 A kind of microblogging news in brief extraction-type generation method based on convolutional neural networks
CN110597981A (en) * 2019-09-16 2019-12-20 西华大学 Network news summary system for automatically generating summary by adopting multiple strategies
CN110633409A (en) * 2018-06-20 2019-12-31 上海财经大学 Rule and deep learning fused automobile news event extraction method
CN111241410A (en) * 2020-01-22 2020-06-05 深圳司南数据服务有限公司 Industry news recommendation method and terminal
CN111259143A (en) * 2020-01-15 2020-06-09 山东劳动职业技术学院(山东劳动技师学院) News automatic labeling method based on LDA model

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104199972A (en) * 2013-09-22 2014-12-10 中科嘉速(北京)并行软件有限公司 Named entity relation extraction and construction method based on deep learning
US20170147682A1 (en) * 2015-11-19 2017-05-25 King Abdulaziz City For Science And Technology Automated text-evaluation of user generated text
CN107766338A (en) * 2017-10-18 2018-03-06 北京信息科技大学 A kind of sports news automatic generation method
CN109697255A (en) * 2017-10-23 2019-04-30 中国科学院沈阳自动化研究所 A kind of Personalize News jettison system and method based on automatic measure on line
CN108197294A (en) * 2018-01-22 2018-06-22 桂林电子科技大学 A kind of text automatic generation method based on deep learning
CN108334626A (en) * 2018-02-12 2018-07-27 百度在线网络技术(北京)有限公司 Generation method, device and the computer equipment of news program
CN108563620A (en) * 2018-04-13 2018-09-21 上海财梵泰传媒科技有限公司 The automatic writing method of text and system
CN110633409A (en) * 2018-06-20 2019-12-31 上海财经大学 Rule and deep learning fused automobile news event extraction method
CN109766410A (en) * 2019-01-07 2019-05-17 东华大学 A kind of newsletter archive automatic classification system based on fastText algorithm
CN110362674A (en) * 2019-07-18 2019-10-22 中国搜索信息科技股份有限公司 A kind of microblogging news in brief extraction-type generation method based on convolutional neural networks
CN110597981A (en) * 2019-09-16 2019-12-20 西华大学 Network news summary system for automatically generating summary by adopting multiple strategies
CN111259143A (en) * 2020-01-15 2020-06-09 山东劳动职业技术学院(山东劳动技师学院) News automatic labeling method based on LDA model
CN111241410A (en) * 2020-01-22 2020-06-05 深圳司南数据服务有限公司 Industry news recommendation method and terminal

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
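刘茂福; 齐乔松; 胡慧君: "Automatic football news generation method based on convolutional neural networks and discourse structure", Journal of Chinese Information Processing, no. 04, 15 April 2019 (2019-04-15) *
周凯; 李芳: "Implementation mechanism for Chinese emergency event summarization based on sentence features and fuzzy inference", Computer Applications and Software, no. 06, 15 June 2009 (2009-06-15) *
王文超; 吕学强; 张凯; 周建设: "Research on automatic writing of football match reports", Journal of Peking University (Natural Science Edition), no. 02, 5 November 2017 (2017-11-05) *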

Similar Documents

Publication Publication Date Title
CN110399457B (en) Intelligent question answering method and system
CN109388795B (en) Named entity recognition method, language recognition method and system
Zubrinic et al. The automatic creation of concept maps from documents written using morphologically rich languages
CN111046656B (en) Text processing method, text processing device, electronic equipment and readable storage medium
CN109960728B (en) Method and system for identifying named entities of open domain conference information
US20080221863A1 (en) Search-based word segmentation method and device for language without word boundary tag
CN106959944A (en) A kind of Event Distillation method and system based on Chinese syntax rule
Athar Sentiment analysis of scientific citations
WO2017080090A1 (en) Extraction and comparison method for text of webpage
CN103049435A (en) Text fine granularity sentiment analysis method and text fine granularity sentiment analysis device
CN111061882A (en) Knowledge graph construction method
CN110609983A (en) Structured decomposition method for policy file
CN110795932B (en) Geological report text information extraction method based on geological ontology
Kutter Corpus analysis
CN112541337A (en) Document template automatic generation method and system based on recurrent neural network language model
CN113806531A (en) Drug relationship classification model construction method, drug relationship classification method and system
CN108763192B (en) Entity relation extraction method and device for text processing
CN111859887A (en) Scientific and technological news automatic writing system based on deep learning
CN112733547A (en) Chinese question semantic understanding method by utilizing semantic dependency analysis
CN115712700A (en) Hot word extraction method, system, computer device and storage medium
CN115455202A (en) Emergency event affair map construction method
Khan et al. Urdu word segmentation using machine learning approaches
Da et al. Deep learning based dual encoder retrieval model for citation recommendation
CN114579695A (en) Event extraction method, device, equipment and storage medium
CN113761128A (en) Event key information extraction method combining domain synonym dictionary and pattern matching

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210723

Address after: 450000 C15-2, 50, Wutong street, Zhengzhou new hi tech Industrial Development Zone, Henan

Applicant after: Zhengzhou chaos Information Technology Co.,Ltd.

Address before: No.3511, 1st floor, building 1, No.1, Anhua street, Konggang street, Shunyi District, Beijing

Applicant before: Beijing Beidou Tianxun Technology Co., Ltd