CN111859887A - Scientific and technological news automatic writing system based on deep learning - Google Patents
- Publication number
- CN111859887A
- Authority
- CN
- China
- Prior art keywords
- news
- scientific
- module
- deep learning
- technological
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/186—Templates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/906—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/247—Thesauruses; Synonyms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a deep-learning-based automatic writing system for science and technology news, relating to the technical field of news writing. The system comprises a web crawler module, a science and technology news preprocessing module, a science and technology news classification and clustering module, a science and technology news deep learning generation training module, an automatic news generation module, and a generated-news display module. It enables rapid generation of science and technology news and can produce news in different styles to match, for example, the styles of different websites.
Description
Technical Field
The invention relates to the technical field of news writing, and in particular to a deep-learning-based automatic writing system for science and technology news, used for information processing and for drafting science and technology news manuscripts.
Background
News covers many categories, such as people's livelihood, current affairs and military affairs. The columns and sections once found in newspapers are increasingly carried by internet news, and news websites of every kind keep emerging.
Science and technology news reports recently occurring scientific and technological facts. Most of it is conference news; its source material is mostly conference manuscripts and related reports, and only rarely exclusive interviews, so the source material is very important. Science and technology news also requires reporters to have a rational, scientific way of thinking. With the development of the internet and the growing number of scientific figures and events, the number of science-related reports increases every day, and more and more news reporting is needed for this category, which raises the cost of news reporting.
To reduce the cost of reporting science and technology news, it is noted that a recent research result from DeepMind remarkably improves ("substantially better") the performance of the recurrent neural network (RNN), a deep learning network widely used in speech recognition, image recognition, semantic understanding and other fields. That research mainly augments temporal generative models with external memory, and it offers some inspiration for research in the field of deep learning.
On the basis of summarizing and analyzing science and technology news written by human authors, the invention discloses a writing system implemented with machine learning methods.
Disclosure of Invention
The purpose of the invention is as follows: to generate science and technology news rapidly, and to generate news in different styles to match, for example, the styles of different websites, the invention provides an automatic science and technology news generation and writing system based on deep learning trained at large scale on massive data.
To achieve this purpose, the invention adopts the following technical scheme: an automatic science and technology news writing system based on deep learning, characterized by comprising the following modules:
a web crawler module: collects science and technology news from the science and technology channels of various websites and from dedicated science and technology websites, extracts the body text from the collected data, and stores the extracted data in a database;
a science and technology news preprocessing module: performs word segmentation, named entity recognition, entity relation extraction, syntactic analysis and semantic analysis on the collected news;
a science and technology news classification and clustering module: targets science and technology news content and refines it further, applying intelligent classification and clustering techniques to classify the news in detail; the news content is then used to train a deep-learning-based generative memory model, on which the news generation model is finally built;
a science and technology news deep learning generation training module: uses a classification system based on SVM and the deep-learning-based TextRNN and, for news whose category is uncertain, also runs an unsupervised clustering algorithm, using LDA-based automatic clustering to cluster content whose class attribution deviates relatively far from the classification threshold;
an automatic news generation module: once the user inputs the keywords, writing style, time and other elements of the news to be written, the news generation model automatically retrieves and writes the news content the user needs and displays it to the user;
a generated-news display module: transmits the news produced by the automatic news generation module to designated forums and news websites over a designated network protocol, where users score it; the quality of the generated news is evaluated and fed back to the fourth module, which is continuously optimized and improved until one edition of basically readable news content is achieved.
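The routing between supervised classification and unsupervised clustering described for the fourth module can be illustrated with a minimal Python sketch. It is not the patented implementation: the per-class scores are assumed to come from some classifier (SVM or TextRNN in the text), and the names `route_by_confidence`, `example_scores` and the threshold value 0.6 are hypothetical.

```python
from typing import Dict, List, Tuple


def route_by_confidence(
    scores: Dict[str, Dict[str, float]], threshold: float = 0.6
) -> Tuple[Dict[str, str], List[str]]:
    """Split articles into confidently classified ones and uncertain ones.

    `scores` maps an article id to per-class scores (assumed classifier output).
    Articles whose best score is below `threshold` are returned separately so
    that a clustering stage (LDA in the text, not implemented here) can handle them.
    """
    classified: Dict[str, str] = {}
    uncertain: List[str] = []
    for article_id, label_scores in scores.items():
        best_label = max(label_scores, key=label_scores.get)
        if label_scores[best_label] >= threshold:
            classified[article_id] = best_label
        else:
            uncertain.append(article_id)
    return classified, uncertain


# Hypothetical scores for two articles.
example_scores = {
    "a1": {"ai": 0.91, "hardware": 0.06, "biotech": 0.03},
    "a2": {"ai": 0.40, "hardware": 0.35, "biotech": 0.25},
}
classified, uncertain = route_by_confidence(example_scores)
```

In this sketch article "a1" keeps its classifier label, while "a2" is left for the clustering stage because no class reaches the assumed threshold.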
Further, the scientific and technological news preprocessing module comprises:
a news content word segmentation submodule: targets news body text and titles; converts traditional Chinese characters to simplified ones, unifies letter case, deletes invalid characters and so on, then segments the processed content into words and removes stop words to form the candidate data set for processing;
a news named entity recognition module: mainly recognizes the person names, place names, organization names, product names, technical terms, times of occurrence and the like that appear in the news;
a news entity relation extraction module: mainly extracts and optimizes the relations among the recognized entities; the entities are recognized with CRF++, and the relations are then extracted according to HowNet and a manually built, annotated entity relation knowledge base, in preparation for the subsequent deep learning training;
a news text content analysis module: mainly performs syntactic analysis on the specific content of the news; the parser is based on the Stanford parser with Chinese support, analyzes the syntactic structure of each sentence and the contextual relations among sentences, and produces an annotated sequence of the parse;
a news text semantic analysis module: analyzes and processes the people, companies, scientific abbreviations, product abbreviations, company abbreviations and personal job titles found in science news reports; replaces and expands synonyms and near-synonyms using semantic resources; computes semantic relatedness with a word2vec-based method; and compiles statistics on some synonyms, near-synonyms and related words from the crawled text.
Further, the word segmentation submodule comprises a word segmentation system, which is an ansj word segmentation system in which a CRF++-based named entity recognition component is embedded.
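The cleanup steps listed for the word segmentation submodule (traditional-to-simplified conversion, case unification, removal of invalid characters, stop-word removal) can be sketched in Python as follows. This is only an illustration under stated assumptions: the three-character conversion table and the stop-word set are hypothetical stand-ins for real resources, and whitespace splitting stands in for a real segmenter such as ansj.

```python
import re
from typing import List, Set

# Hypothetical, tiny traditional-to-simplified table; a real system would use a full mapping.
TRAD_TO_SIMP = str.maketrans({"體": "体", "學": "学", "術": "术"})
# Hypothetical stop-word list.
STOP_WORDS: Set[str] = {"the", "a", "of", "的", "了"}
# Control characters, zero-width space and byte-order mark are treated as invalid.
INVALID_CHARS = re.compile(r"[\x00-\x1f\u200b\ufeff]")


def normalize(text: str) -> str:
    """Convert traditional characters, lowercase, and blank out invalid characters."""
    text = text.translate(TRAD_TO_SIMP)
    text = text.lower()
    return INVALID_CHARS.sub(" ", text)


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Drop empty tokens and stop words."""
    return [t for t in tokens if t and t not in STOP_WORDS]


def preprocess(text: str) -> List[str]:
    """Normalize, split on whitespace (placeholder for word segmentation), filter."""
    return remove_stop_words(normalize(text).split())


tokens = preprocess("The  GPU\u200b of\tDeepMind")
```

Invalid characters are replaced by a space rather than deleted so that a stray tab or zero-width character does not glue two tokens together.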
Further, CRF++ is implemented in C++ and makes heavy use of STL data structures. After a close reading of the source code, the parts of the code that rely on the STL are rewritten in C. For example, the tagger.cpp source file uses a vector<const char *> structure.
In addition, the original code does not release memory after the features have been encoded. The invention replaces the std::vector<std::vector<const char *> > member x_ of TaggerImpl and forces its memory to be released immediately; experimental comparison shows a 10% reduction in memory. The invention also modifies the L-BFGS part of CRF++. As its name indicates, L-BFGS is an improvement of the BFGS algorithm, which is based on the quasi-Newton method. The basic idea of the L-BFGS algorithm is as follows: the algorithm stores and uses only the curvature information of the most recent m iterations to construct an approximation of the Hessian matrix. The step length is optimized along the iteration direction and adjusted automatically according to the observed training effect, which keeps the training step from being too small or too large.
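The basic idea of L-BFGS described above can be sketched with the standard two-loop recursion in plain Python. This is a textbook illustration, not the modified CRF++ code: the backtracking (Armijo) line search is one common way to adjust the step length and is an assumption here, as are the function names and the quadratic test objective.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def two_loop_direction(grad: Vector, history: Deque[Tuple[Vector, Vector]]) -> Vector:
    """Return the search direction -H*grad built from the stored (s, y) pairs."""
    q = list(grad)
    alphas = []
    for s, y in reversed(history):  # newest pair first
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        alphas.append((alpha, rho))
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
    # Scale by an initial Hessian estimate taken from the newest pair.
    gamma = 1.0
    if history:
        s, y = history[-1]
        gamma = dot(s, y) / dot(y, y)
    r = [gamma * qi for qi in q]
    for (s, y), (alpha, rho) in zip(history, reversed(alphas)):  # oldest pair first
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]


def lbfgs_minimize(f: Callable[[Vector], float], grad_f: Callable[[Vector], Vector],
                   x0: Vector, m: int = 5, max_iter: int = 100, tol: float = 1e-8) -> Vector:
    """Minimize f, keeping only the curvature pairs of the last m iterations."""
    x, g = list(x0), grad_f(x0)
    history: Deque[Tuple[Vector, Vector]] = deque(maxlen=m)
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        d = two_loop_direction(g, history)
        step, fx, slope = 1.0, f(x), dot(g, d)
        # Backtracking line search: shrink the step until f decreases sufficiently.
        while f([xi + step * di for xi, di in zip(x, d)]) > fx + 1e-4 * step * slope:
            step *= 0.5
            if step < 1e-12:
                break
        x_new = [xi + step * di for xi, di in zip(x, d)]
        g_new = grad_f(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        if dot(s, y) > 1e-12:  # keep only pairs with positive curvature
            history.append((s, y))
        x, g = x_new, g_new
    return x


# Hypothetical test objective: a convex quadratic with minimum at (1, -2).
solution = lbfgs_minimize(
    lambda v: (v[0] - 1) ** 2 + 10 * (v[1] + 2) ** 2,
    lambda v: [2 * (v[0] - 1), 20 * (v[1] + 2)],
    [0.0, 0.0],
)
```

The `deque(maxlen=m)` discards the oldest pair automatically, which is what keeps the memory cost proportional to m rather than to the full Hessian.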
Furthermore, the news named entity recognition module uses a CRF++ model to train on corpora and to recognize the person names, place names, organization names, product names, technical terms, times of occurrence and the like that appear in the news.
Further, the automatic news generation module comprises a user interaction module and a news generation module.
Furthermore, the user interaction module mainly works as follows: the user inputs keywords for the content of the science and technology article to be generated; the generated writing model automatically searches for sentences it has learned to generate that relate to those keywords, and then learns the relations among the keywords so that the news content transitions smoothly at the chapter level of the article. Multiple keywords and writing styles are combined, analyzed and decomposed with a recurrent neural network, and new trends are added so that long-range information is stored, protected and memorized.
The invention has the following beneficial effects:
1. The learning sources for writing come from the Internet. A highly general-purpose crawler collects science and technology news data, which greatly improves collection speed, and the content, titles, abstracts and body text of the collected pages can be extracted. If a new type of news is found, its data can be collected and stored quickly by configuring a new collection source. This solves the problem that writing science and technology news requires a large amount of manpower and material resources, and reduces labor and time costs.
2. In the news content preprocessing stage, an independently developed intelligent word segmentation system is used, which effectively improves word segmentation accuracy and provides a good foundation for data processing.
3. The improved CRF++ is used for named entity recognition and entity relation extraction, giving higher recognition accuracy; syntactic and semantic analysis are then performed to segment the article content effectively, which greatly improves the generalization ability of the final deep learning model.
4. The invention adopts intelligent classification and clustering algorithms to aggregate the collected news effectively, which facilitates deep learning of each writing style.
5. The invention uses deep learning to learn the writing style, writing mode, content characteristics, length, writing scenario and the like of each collected and classified science news item, thereby generating a writing model.
6. The science and technology news generation system is simple to use: a draft can be generated quickly by entering just a few keywords, a writing style and a news subcategory, so writing is cheap and fast.
7. The method can be used for science and technology news writing and, as its generalization ability improves, can be quickly extended to other fields, so it is readily extensible.
Drawings
FIG. 1 is a schematic diagram of the overall architecture of the present invention;
FIG. 2 is a content processing flow diagram of the present invention;
FIG. 3 is a diagram of the training model generation for automatically generating news in accordance with the present invention;
FIG. 4 is a flow chart of an implementation of the present invention for automatically generating news.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Furthermore, the terms "first," "second," and the like are used merely to distinguish one description from another, and are not to be construed as indicating or implying relative importance.
In the description of the embodiments of the present invention, it should be noted that the terms "inside", "outside", "upper", and the like indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings or orientations or positional relationships conventionally arranged when products of the present invention are used, and are only used for convenience in describing the present invention and simplifying the description, but do not indicate or imply that the devices or elements indicated must have specific orientations, be constructed in specific orientations, and operated, and thus, cannot be construed as limiting the present invention.
Example 1
As shown in FIGS. 1 to 4, a deep-learning-based automatic writing system for science and technology news comprises the following modules:
a web crawler module: collects science and technology news from the science and technology channels of various websites and from dedicated science and technology websites, extracts the body text from the collected data, and stores the extracted data in a database;
a science and technology news preprocessing module: performs word segmentation, named entity recognition, entity relation extraction, syntactic analysis and semantic analysis on the collected news;
a science and technology news classification and clustering module: targets science and technology news content and refines it further, applying intelligent classification and clustering techniques to classify the news in detail; the news content is then used to train a deep-learning-based generative memory model, on which the news generation model is finally built;
a science and technology news deep learning generation training module: uses a classification system based on SVM and the deep-learning-based TextRNN and, for news whose category is uncertain, also runs an unsupervised clustering algorithm, using LDA-based automatic clustering to cluster content whose class attribution deviates relatively far from the classification threshold;
an automatic news generation module: once the user inputs the keywords, writing style, time and other elements of the news to be written, the news generation model automatically retrieves and writes the news content the user needs and displays it to the user;
a generated-news display module: transmits the news produced by the automatic news generation module to designated forums and news websites over a designated network protocol, where users score it; the quality of the generated news is evaluated and fed back to the fourth module, which is continuously optimized and improved until one edition of basically readable news content is achieved.
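The crawler module's "extract the body text and store it in a database" step can be illustrated with a minimal Python sketch that uses only the standard library. Page fetching is deliberately left out; the class `BodyTextExtractor`, the table layout and the sample page are hypothetical and are not taken from the patent.

```python
import sqlite3
from html.parser import HTMLParser
from typing import List


class BodyTextExtractor(HTMLParser):
    """Collect visible text while skipping script and style blocks."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def extract_text(html: str) -> str:
    """Return the visible text of an HTML page as one space-joined string."""
    parser = BodyTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def store_article(conn: sqlite3.Connection, url: str, text: str) -> None:
    """Insert or update one article, keyed by its URL (hypothetical schema)."""
    conn.execute("CREATE TABLE IF NOT EXISTS articles (url TEXT PRIMARY KEY, body TEXT)")
    conn.execute("INSERT OR REPLACE INTO articles (url, body) VALUES (?, ?)", (url, text))
    conn.commit()


sample_html = (
    "<html><head><style>p{}</style></head><body><h1>Chip launch</h1>"
    "<p>A new chip was announced.</p><script>var x=1;</script></body></html>"
)
conn = sqlite3.connect(":memory:")
store_article(conn, "https://example.com/news/1", extract_text(sample_html))
```

Keying the table on the URL means that re-crawling the same page updates the stored text instead of duplicating it, which is one simple design choice among several possible ones.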
The science and technology news preprocessing module comprises:
a news content word segmentation submodule: targets news body text and titles; converts traditional Chinese characters to simplified ones, unifies letter case, deletes invalid characters and so on, then segments the processed content into words and removes stop words to form the candidate data set for processing;
a news named entity recognition module: mainly recognizes the person names, place names, organization names, product names, technical terms, times of occurrence and the like that appear in the news;
a news entity relation extraction module: mainly extracts and optimizes the relations among the recognized entities; the entities are recognized with CRF++, and the relations are then extracted according to HowNet and a manually built, annotated entity relation knowledge base, in preparation for the subsequent deep learning training;
a news text content analysis module: mainly performs syntactic analysis on the specific content of the news; the parser is based on the Stanford parser with Chinese support, analyzes the syntactic structure of each sentence and the contextual relations among sentences, and produces an annotated sequence of the parse;
a news text semantic analysis module: analyzes and processes the people, companies, scientific abbreviations, product abbreviations, company abbreviations and personal job titles found in science news reports; replaces and expands synonyms and near-synonyms using semantic resources; computes semantic relatedness with a word2vec-based method; and compiles statistics on some synonyms, near-synonyms and related words from the crawled text.
Further, the word segmentation submodule comprises a word segmentation system, which is an ansj word segmentation system in which a CRF++-based named entity recognition component is embedded.
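The word2vec-based relatedness calculation mentioned for the semantic analysis module usually reduces to cosine similarity between word vectors. The Python sketch below shows that calculation only; the three-dimensional vectors in `toy_vectors` are invented for illustration and are not output of a trained word2vec model.

```python
import math
from typing import Dict, List, Tuple


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_related(word: str, vectors: Dict[str, List[float]], top_k: int = 2) -> List[Tuple[str, float]]:
    """Rank the other words in `vectors` by similarity to `word`."""
    target = vectors[word]
    scored = [(other, cosine(target, vec)) for other, vec in vectors.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings; real vectors would come from a trained model.
toy_vectors = {
    "cpu": [0.9, 0.1, 0.0],
    "processor": [0.85, 0.15, 0.05],
    "battery": [0.1, 0.9, 0.2],
}
ranking = most_related("cpu", toy_vectors)
```

With these toy vectors "processor" ranks above "battery" for the query "cpu", which is the kind of signal that could feed synonym expansion.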
The user interaction module mainly works as follows: the user inputs keywords for the content of the science and technology article to be generated; the generated writing model automatically searches for sentences it has learned to generate that relate to those keywords, and then learns the relations among the keywords so that the news content transitions smoothly at the chapter level of the article. Multiple keywords and writing styles are combined, analyzed and decomposed with a recurrent neural network, and new trends are added so that long-range information is stored, protected and memorized.
Example 2
As shown in FIGS. 1 to 4, to reduce memory usage, this embodiment further improves on Example 1, specifically: CRF++ is implemented in C++ and makes heavy use of STL data structures. After a close reading of the source code, the parts of the code that rely on the STL are rewritten in C. For example, the tagger.cpp source file uses a vector<const char *> structure.
In addition, the original code does not release memory after the features have been encoded. The invention replaces the std::vector<std::vector<const char *> > member x_ of TaggerImpl and forces its memory to be released immediately; experimental comparison shows a 10% reduction in memory. The invention also modifies the L-BFGS part of CRF++. As its name indicates, L-BFGS is an improvement of the BFGS algorithm, which is based on the quasi-Newton method. The basic idea of the L-BFGS algorithm is as follows: the algorithm stores and uses only the curvature information of the most recent m iterations to construct an approximation of the Hessian matrix. The step length is optimized along the iteration direction and adjusted automatically according to the observed training effect, which keeps the training step from being too small or too large.
Example 3
As shown in FIGS. 1 to 4, the news named entity recognition module uses a CRF++ model to train on corpora and to recognize the person names, place names, organization names, product names, technical terms, times of occurrence and the like that appear in the news. The automatic news generation module comprises a user interaction module and a news generation module.
Claims (7)
1. An automatic science and technology news writing system based on deep learning, characterized by comprising the following modules:
a web crawler module: collects science and technology news from the science and technology channels of various websites and from dedicated science and technology websites, extracts the body text from the collected data, and stores the extracted data in a database;
a science and technology news preprocessing module: performs word segmentation, named entity recognition, entity relation extraction, syntactic analysis and semantic analysis on the collected news;
a science and technology news classification and clustering module: targets science and technology news content and refines it further, applying intelligent classification and clustering techniques to classify the news in detail; the news content is then used to train a deep-learning-based generative memory model, on which the news generation model is finally built;
a science and technology news deep learning generation training module: uses a classification system based on SVM and the deep-learning-based TextRNN and, for news whose category is uncertain, also runs an unsupervised clustering algorithm, using LDA-based automatic clustering to cluster content whose class attribution deviates relatively far from the classification threshold;
an automatic news generation module: once the user inputs the keywords, writing style, time and other elements of the news to be written, the news generation model automatically retrieves and writes the news content the user needs and displays it to the user;
a generated-news display module: transmits the news produced by the automatic news generation module to designated forums and news websites over a designated network protocol, where users score it; the quality of the generated news is evaluated and fed back to the fourth module, which is continuously optimized and improved until one edition of basically readable news content is achieved.
2. The automatic scientific news authoring system based on deep learning of claim 1, wherein: the science and technology news preprocessing module comprises:
a news content word segmentation submodule: targets news body text and titles; converts traditional Chinese characters to simplified ones, unifies letter case, deletes invalid characters and so on, then segments the processed content into words and removes stop words to form the candidate data set for processing;
a news named entity recognition module: mainly recognizes the person names, place names, organization names, product names, technical terms, times of occurrence and the like that appear in the news;
a news entity relation extraction module: mainly extracts and optimizes the relations among the recognized entities; the entities are recognized with CRF++, and the relations are then extracted according to HowNet and a manually built, annotated entity relation knowledge base, in preparation for the subsequent deep learning training;
a news text content analysis module: mainly performs syntactic analysis on the specific content of the news; the parser is based on the Stanford parser with Chinese support, analyzes the syntactic structure of each sentence and the contextual relations among sentences, and produces an annotated sequence of the parse;
a news text semantic analysis module: analyzes and processes the people, companies, scientific abbreviations, product abbreviations, company abbreviations and personal job titles found in science news reports; replaces and expands synonyms and near-synonyms using semantic resources; computes semantic relatedness with a word2vec-based method; and compiles statistics on some synonyms, near-synonyms and related words from the crawled text.
3. The automatic scientific news authoring system based on deep learning of claim 2, wherein: the word segmentation submodule comprises a word segmentation system, which is the ansj word segmentation system with CRF++ embedded in its named entity recognition component.
4. The automatic scientific news authoring system based on deep learning of claim 3, wherein: CRF++ is implemented in C++ and makes heavy use of STL data structures, and, after close study of the source code, the STL-related portions of the source code are rewritten in C.
5. The automatic scientific news authoring system based on deep learning of claim 2, wherein: the news named entity recognition module trains on corpora and uses a CRF++ model to recognize person names, place names, organization names, product names, technical terms, times of occurrence and the like in news.
6. The automatic scientific news authoring system based on deep learning of claim 1, wherein: the news automatic generation module comprises a user interaction module and a news generation module.
7. The automatic scientific news authoring system based on deep learning of claim 6, wherein: the user interaction module is mainly used as follows: the user inputs keywords for the scientific and technological article to be generated; the generated writing model automatically searches for and learns sentences related to those keywords, then learns the relations among the keywords so that the news content transitions smoothly at the chapter level; multiple keywords and writing styles are combined and are analyzed and decomposed by a recurrent neural network, with new trends added to store and protect long-range information, which is memorized and stored.
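The text preprocessing recited in claim 2 (traditional/simplified conversion, case unification, invalid-character removal, word segmentation, stop-word removal) can be sketched roughly as follows. This is a minimal illustration in Python, not the patented implementation: the conversion table `T2S`, the stop-word set `STOP_WORDS`, the definition of an "invalid character", and the whitespace-based default for `segment` are all assumptions made for the example. The claims name the ansj segmenter, which would take the place of `str.split` here.

```python
import re
import unicodedata

# Hypothetical, tiny traditional-to-simplified table; a real system would
# use a full conversion dictionary.
T2S = str.maketrans({"學": "学", "習": "习"})

# Hypothetical stop-word list for illustration only.
STOP_WORDS = {"the", "a", "of", "and", "的", "了", "是"}

# Assumption: anything other than CJK ideographs, lowercase ASCII letters,
# digits and whitespace counts as an "invalid character".
_INVALID = re.compile(r"[^\u4e00-\u9fff0-9a-z\s]")


def normalize(text: str) -> str:
    """Unify width and case, convert traditional characters, drop invalid ones."""
    text = unicodedata.normalize("NFKC", text)  # full-width forms -> ASCII
    text = text.lower()                         # unify letter case
    text = text.translate(T2S)                  # traditional -> simplified
    text = _INVALID.sub(" ", text)              # replace invalid characters
    return re.sub(r"\s+", " ", text).strip()    # collapse runs of whitespace


def preprocess(text: str, segment=str.split) -> list:
    """Normalize, segment into tokens, and remove stop words."""
    tokens = segment(normalize(text))
    return [t for t in tokens if t not in STOP_WORDS]
```

Passing the segmenter in as a parameter keeps the normalization and stop-word steps independent of whichever word segmentation system is actually used.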
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010707063.6A CN111859887A (en) | 2020-07-21 | 2020-07-21 | Scientific and technological news automatic writing system based on deep learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010707063.6A CN111859887A (en) | 2020-07-21 | 2020-07-21 | Scientific and technological news automatic writing system based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111859887A (en) | 2020-10-30 |
Family
ID=73001468
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010707063.6A Pending CN111859887A (en) | 2020-07-21 | 2020-07-21 | Scientific and technological news automatic writing system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111859887A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112989811A (en) * | 2021-03-01 | 2021-06-18 | 哈尔滨工业大学 | BiLSTM-CRF-based historical book reading auxiliary system and control method thereof |
CN113590999A (en) * | 2021-06-23 | 2021-11-02 | 小铁世纪(成都)科技有限公司 | Adaptive content identification and release system based on small program |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104199972A (en) * | 2013-09-22 | 2014-12-10 | 中科嘉速(北京)并行软件有限公司 | Named entity relation extraction and construction method based on deep learning |
US20170147682A1 (en) * | 2015-11-19 | 2017-05-25 | King Abdulaziz City For Science And Technology | Automated text-evaluation of user generated text |
CN107766338A (en) * | 2017-10-18 | 2018-03-06 | 北京信息科技大学 | A kind of sports news automatic generation method |
CN108197294A (en) * | 2018-01-22 | 2018-06-22 | 桂林电子科技大学 | A kind of text automatic generation method based on deep learning |
CN108334626A (en) * | 2018-02-12 | 2018-07-27 | 百度在线网络技术(北京)有限公司 | Generation method, device and the computer equipment of news program |
CN108563620A (en) * | 2018-04-13 | 2018-09-21 | 上海财梵泰传媒科技有限公司 | The automatic writing method of text and system |
CN109697255A (en) * | 2017-10-23 | 2019-04-30 | 中国科学院沈阳自动化研究所 | A kind of Personalize News jettison system and method based on automatic measure on line |
CN109766410A (en) * | 2019-01-07 | 2019-05-17 | 东华大学 | A kind of newsletter archive automatic classification system based on fastText algorithm |
CN110362674A (en) * | 2019-07-18 | 2019-10-22 | 中国搜索信息科技股份有限公司 | A kind of microblogging news in brief extraction-type generation method based on convolutional neural networks |
CN110597981A (en) * | 2019-09-16 | 2019-12-20 | 西华大学 | Network news summary system for automatically generating summary by adopting multiple strategies |
CN110633409A (en) * | 2018-06-20 | 2019-12-31 | 上海财经大学 | Rule and deep learning fused automobile news event extraction method |
CN111241410A (en) * | 2020-01-22 | 2020-06-05 | 深圳司南数据服务有限公司 | Industry news recommendation method and terminal |
CN111259143A (en) * | 2020-01-15 | 2020-06-09 | 山东劳动职业技术学院(山东劳动技师学院) | News automatic labeling method based on LDA model |
Non-Patent Citations (3)
Title |
---|
刘茂福;齐乔松;胡慧君;: "基于卷积神经网络与篇章结构的足球新闻自动生成方法", 中文信息学报, no. 04, 15 April 2019 (2019-04-15) * |
周凯;李芳;: "基于句子特征与模糊推断的中文突发事件摘要实现机制", 计算机应用与软件, no. 06, 15 June 2009 (2009-06-15) * |
王文超;吕学强;张凯;周建设;: "足球赛事战报的自动写作研究", 北京大学学报(自然科学版), no. 02, 5 November 2017 (2017-11-05) * |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110399457B (en) | Intelligent question answering method and system | |
CN109388795B (en) | Named entity recognition method, language recognition method and system | |
Zubrinic et al. | The automatic creation of concept maps from documents written using morphologically rich languages | |
CN111046656B (en) | Text processing method, text processing device, electronic equipment and readable storage medium | |
CN109960728B (en) | Method and system for identifying named entities of open domain conference information | |
US20080221863A1 (en) | Search-based word segmentation method and device for language without word boundary tag | |
CN106959944A (en) | A kind of Event Distillation method and system based on Chinese syntax rule | |
Athar | Sentiment analysis of scientific citations | |
WO2017080090A1 (en) | Extraction and comparison method for text of webpage | |
CN103049435A (en) | Text fine granularity sentiment analysis method and text fine granularity sentiment analysis device | |
CN111061882A (en) | Knowledge graph construction method | |
CN110609983A (en) | Structured decomposition method for policy file | |
CN110795932B (en) | Geological report text information extraction method based on geological ontology | |
Kutter | Corpus analysis | |
CN112541337A (en) | Document template automatic generation method and system based on recurrent neural network language model | |
CN113806531A (en) | Drug relationship classification model construction method, drug relationship classification method and system | |
CN108763192B (en) | Entity relation extraction method and device for text processing | |
CN111859887A (en) | Scientific and technological news automatic writing system based on deep learning | |
CN112733547A (en) | Chinese question semantic understanding method by utilizing semantic dependency analysis | |
CN115712700A (en) | Hot word extraction method, system, computer device and storage medium | |
CN115455202A (en) | Emergency event affair map construction method | |
Khan et al. | Urdu word segmentation using machine learning approaches | |
Da et al. | Deep learning based dual encoder retrieval model for citation recommendation | |
CN114579695A (en) | Event extraction method, device, equipment and storage medium | |
CN113761128A (en) | Event key information extraction method combining domain synonym dictionary and pattern matching |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 2021-07-23. Address after: 450000 C15-2, 50, Wutong street, Zhengzhou new hi tech Industrial Development Zone, Henan. Applicant after: Zhengzhou chaos Information Technology Co.,Ltd. Address before: No.3511, 1st floor, building 1, No.1, Anhua street, Konggang street, Shunyi District, Beijing. Applicant before: Beijing Beidou Tianxun Technology Co., Ltd |