CN108563620A - Automatic text writing method and system - Google Patents
- Publication number
- CN108563620A (application number CN201810331488.4A)
- Authority
- CN
- China
- Prior art keywords
- text
- information
- data
- classification
- writing method
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
Abstract
The present invention provides an automatic text writing method comprising an information collection process, a text parsing process, a content generation process, a product presentation process and a reader behavior analysis process. The reader behavior analysis process includes: obtaining reader behavior information from one or more internet platforms and analyzing the reader behavior information, wherein the information collection process, the text parsing process and the content generation process are adjusted according to the reader behavior information.
Description
Technical field
The present invention relates mainly to the computer field, and more particularly to an automatic text writing method and system.
Background art
With the rapid development of the internet, more and more first-hand information is published online. This information is rich in type, huge in volume and varied in form. Content creators, especially media workers, want to monitor and obtain this mass of information in a timely manner, manage the large amount of writing material collected through various channels effectively, and screen, process and turn this material into content efficiently and quickly.
Some automatic text writing methods have been proposed, most of which are based on structured information. Structured information is decomposed after analysis into multiple interrelated components with a clear hierarchy among them; it is operated and maintained through a database and follows certain operating specifications. In contrast, much of the content of unstructured information is unpredictable. Writing automatically from unstructured information is a major challenge.
Summary of the invention
The technical problem to be solved by the present invention is to provide an automatic text writing method and system that facilitate automatic writing from unstructured information.
To solve the above technical problem, the present invention provides an automatic text writing method comprising the following steps: an information collection process, including: collecting information from the internet, converting the format of the information, cleaning noise from the information, and performing preliminary data screening on the information to obtain text, wherein the text includes an unstructured part; a text parsing process, including: classifying the text, identifying named entities in the text according to the category of the text, extracting entity relations between the named entities in the text according to the category of the text, and extracting, according to the category of the text, event elements that reflect events in the text; a content generation process, including: pre-configuring one or more writing scenes, pre-configuring one or more logic templates, generating paragraphs from the named entities, the entity relations and the event elements by applying the writing scenes and logic templates, and identifying associated paragraphs and aggregating them into an article; a product presentation process, including: publishing the article to one or more internet platforms; and a reader behavior analysis process, including: obtaining reader behavior information from the one or more internet platforms and analyzing the reader behavior information, wherein the information collection process, the text parsing process and the content generation process are adjusted according to the reader behavior information.
In an embodiment of the present invention, the text parsing process further includes: extracting keywords refined in advance from the text.
In an embodiment of the present invention, the text parsing process further includes: extracting key information from the text.
In an embodiment of the present invention, the text parsing process further includes: extracting sentences from the text for composing a document summary.
In an embodiment of the present invention, the text parsing process further includes: analyzing the sentiment polarity of the text.
In an embodiment of the present invention, the above method further includes a data analysis process, which includes: performing numerical calculation and statistics on the data in the text, and monitoring whether abnormal values appear in the data in the text.
In an embodiment of the present invention, the step of classifying the text includes classifying according to pre-established categories, wherein the method for pre-establishing the categories includes: obtaining one or more set categories; assigning a first part of multiple training texts to the one or more categories; dividing a second part of the multiple training texts, which cannot be assigned to the one or more categories, into one or more clusters; and receiving classification labels established for the one or more clusters.
In an embodiment of the present invention, each of the one or more logic templates includes one or more candidate sentences, and each candidate sentence includes one or more candidate named entities, elements and sentence patterns.
In an embodiment of the present invention, the step of generating an article from the named entities, the entity relations and the event elements by applying the writing scenes and logic templates includes: automatically generating paragraphs from input parameters using a deep learning method, the paragraphs being incorporated into the logic templates.
The present invention also proposes an automatic text writing system, including: a memory for storing instructions executable by a processor; and a processor for executing the instructions to implement the method described above.
The automatic text writing method and system of the embodiments of the present invention combine key modules such as information collection, data analysis, text editing, content publishing and data feedback, realize an integrated automatic writing process, and improve the efficiency and timeliness of content production.
Description of the drawings
Fig. 1 is a schematic diagram of an automatic text writing system according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of an automatic text writing system according to another embodiment of the present invention.
Fig. 3 is a schematic diagram of an automatic text writing method according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of information collection according to an embodiment of the present invention.
Fig. 5 is a schematic diagram of text classification according to an embodiment of the present invention.
Fig. 6 is a cluster dendrogram of announcements of unknown category according to an embodiment of the present invention.
Fig. 7 is an example of a text category system according to an embodiment of the present invention.
Fig. 8 is an example of named entity recognition according to an embodiment of the present invention.
Fig. 9 is an example of keyword extraction results according to an embodiment of the present invention.
Fig. 10 is an example of event extraction results according to an embodiment of the present invention.
Fig. 11 is an example of target key information according to an embodiment of the present invention.
Detailed description
To make the above objects, features and advantages of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the drawings.
Many specific details are set forth in the following description to facilitate a full understanding of the present invention, but the present invention can also be implemented in other ways different from those described here, and is therefore not limited by the specific embodiments disclosed below.
As used in this application and the claims, unless the context clearly indicates otherwise, words such as "a", "an", "one" and/or "the" do not specifically refer to the singular and may also include the plural. In general, the terms "include" and "comprise" only indicate that the explicitly identified steps and elements are included; these steps and elements do not constitute an exclusive list, and a method or device may also include other steps or elements.
The embodiments of the present invention describe an automatic text writing method and system, which facilitate automatic writing from unstructured information.
Fig. 1 is a block diagram of an automatic text writing system according to an embodiment of the present invention. Referring to Fig. 1, the automatic text writing system 100 may include an internal communication bus 101, a processor 102, a read-only memory (ROM) 103, a random access memory (RAM) 104, a communication port 105, an input/output component 106, a hard disk 107 and a user interface 108. The internal communication bus 101 enables data communication among the components of the computer 100. The processor 102 can make judgments and issue prompts. In some embodiments, the processor 102 may consist of one or more processors. The communication port 105 enables data communication between the computer 100 and other components (not shown). In some embodiments, the computer 100 can send and receive information and data over a network through the communication port 105. The input/output component 106 supports input/output data flow between the computer 100 and other components. The user interface 108 enables interaction and information exchange between the computer 100 and a user. The computer 100 may also include various forms of program storage units and data storage units, such as the hard disk 107, the read-only memory (ROM) 103 and the random access memory (RAM) 104, which can store various data files used for computer processing and/or communication, as well as possible program instructions executed by the processor 102.
As an example, the input/output component 106 may include one or more of the following: a mouse, a trackball, a keyboard, a touch component, a sound receiver, and the like.
For example, the automatic text writing method of this application may be implemented as a computer program, stored in the hard disk 107, and loaded into the processor 102 for execution to implement the method of this application.
It can be understood that the automatic text writing system of this application is not limited to being implemented by a single computer, but may be implemented cooperatively by multiple networked computers. The networked computers may be connected and communicate via a local area network or a wide area network.
For example, the automatic text writing system of an embodiment of the present invention may be automatic text writing software stored on a hard disk.
When the automatic text writing system is implemented as software, it may also be stored in a computer-readable storage medium as a product. For example, computer-readable storage media may include, but are not limited to, magnetic storage devices (e.g., hard disks, floppy disks, magnetic strips), optical discs (e.g., compact discs (CD), digital versatile discs (DVD)), smart cards and flash memory devices (e.g., erasable programmable read-only memory (EPROM), cards, sticks, key drives). In addition, the various storage media described herein may represent one or more devices and/or other machine-readable media for storing information. The term "machine-readable medium" may include, but is not limited to, wireless channels and various other media (and/or storage media) capable of storing, containing and/or carrying code and/or instructions and/or data.
The automatic text writing system of an embodiment of the present invention may also be implemented in the form of Software as a Service (SaaS). Fig. 2 is a block diagram of an automatic text writing system according to another embodiment of the present invention. Referring to Fig. 2, the system may include a client 210 and a server 220, which are connected by a network 210. The network 210 may be any known wired or wireless network and is not described further here. The server 220 and the client 210 cooperate to implement the method of the foregoing embodiments or variations thereof. The client 210 may be provided with a user interface, a communication port and an input component. The user interface presents various interfaces to the user, and the input component receives user input. The server 220 may be provided with a communication port (not shown), a memory 221 and a processor (not shown); the memory 221 stores computer instructions, and the processor executes these instructions to implement the main part of the method. The results of the processor's processing are transmitted to the client 210 through the communication port and displayed on the user interface of the client 210.
It can be understood that the automatic text writing system of this application is not limited to being implemented by a single server, but may be implemented cooperatively by multiple networked servers. The networked servers may be connected and communicate via a local area network or a wide area network.
It should be understood that the embodiments described above are illustrative only. The embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode or any combination thereof. For a hardware implementation, the processing units may be implemented within one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field programmable gate arrays (FPGA), processors, controllers, microcontrollers, microprocessors and/or other electronic units designed to perform the functions described herein, or a combination thereof.
Fig. 3 is a schematic diagram of an automatic text writing method according to an embodiment of the present invention. The method of this embodiment may be implemented in the automatic text writing system of Fig. 1 or Fig. 2, or a variation thereof. Referring to Fig. 3, the automatic text writing method of this embodiment may include an information collection process 310, a text parsing process 320, a content generation process 330, a product presentation process 340 and a reader behavior analysis process 350. Optionally, the automatic text writing method may include a diagnosis process 360.
The information collection process 310 may include a step 311 of collecting information from the internet, a step 312 of converting the format of the information, a step 313 of cleaning noise from the information, and a step 314 of performing preliminary data screening on the information. This process yields text, which may include an unstructured part. Of course, the text may also include a structured part and/or a semi-structured part. The processing described below is directed mainly at the unstructured part.
The text parsing process 320 may include: a step 321 of classifying the text, a step 322 of identifying named entities in the text according to the category of the text, a step 324 of extracting entity relations between the named entities in the text according to the category of the text, and a step 325 of extracting, according to the category of the text, event elements that reflect events in the text. Optionally, the text parsing process 320 may also include: a step 323 of extracting keywords refined in advance from the text, a step 326 of extracting sentences from the text for composing a document summary, a step 327 of extracting key information from the text, and a step 328 of analyzing the sentiment polarity of the text.
The content generation process 330 may include: a paragraph generation step 331, a step 332 of identifying, selecting and combining associated paragraphs, and a manuscript generation step 333. Here, one or more writing scenes and one or more logic templates are pre-configured. In the paragraph generation step 331, paragraphs are generated from the named entities, entity relations and event elements obtained by the text parsing process 320 by applying the writing scenes and logic templates.
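As a rough illustration of template-based paragraph generation, the following Python sketch fills candidate sentences of a logic template with extracted values. The `LogicTemplate` class, the slot names and the scene name are hypothetical and are not taken from the patent; the sketch only shows the general idea of choosing a candidate sentence whose slots can all be filled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LogicTemplate:
    scene: str             # writing scene the template is configured for (hypothetical)
    candidates: List[str]  # candidate sentences with named slots


def generate_paragraph(template: LogicTemplate, slots: Dict[str, str]) -> Optional[str]:
    """Return the first candidate sentence whose slots can all be filled."""
    for sentence in template.candidates:
        try:
            return sentence.format(**slots)
        except KeyError:
            continue  # a required slot was not extracted; try the next candidate
    return None


template = LogicTemplate(
    scene="acquisition_brief",
    candidates=[
        "{acquirer} acquired {target} for {price} on {time}.",
        "{acquirer} acquired {target}.",
    ],
)
print(generate_paragraph(template, {"acquirer": "Company A", "target": "Company B"}))
```

Ordering the candidates from most to least detailed lets the generator fall back gracefully when the parsing stage could not extract every element.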
The product presentation process 340 may include a step 341 of publishing the article to one or more internet platforms.
The reader behavior analysis process 350 may include a step 351 of obtaining reader behavior information from the one or more internet platforms, and a step 352 of analyzing the reader behavior information. The information obtained by the reader behavior analysis process 350 can enter the diagnosis process 360.
The diagnosis process 360 may include adjusting the information collection process 310, the text parsing process 320 and the content generation process 330 according to the reader behavior information. The reader behavior information also serves as a reference for distribution platforms and content creators when selecting articles and modifying content.
Optionally, a manual content review and correction step 334 may be added after the content generation process. The diagnosis process 360 may collect feedback from step 334 in step 361 and, through diagnosis, perform error statistics and error analysis for each of the processes 310-340, helping the system iterate and optimize continuously according to actual conditions and further improve its efficiency.
In this embodiment, the data obtained by the information collection process 310 may be put into the original content library 32 of the database 30. The text parsing process 320 may use the domain knowledge base 33 of the database 30, and the data it obtains may be put into the original content library 31 of the database 30. The content generation process 330 may use the writing scene and logic template library 34 of the database 30, and the data it obtains may be put into the machine manuscript library 35 of the database 30.
Each process is described in detail below.
Information collection
Distributed crawler
Fig. 4 is a schematic diagram of an information collection system according to an embodiment of the present invention. Referring to Fig. 4, a distributed crawler system may be provided to obtain massive amounts of data from the internet, including all kinds of websites and social platforms. Architecturally, the distributed crawler system is dynamically scalable along two dimensions: performance and data sources. To this end, the crawler architecture is decoupled into two major modules: a platform module (including a central scheduler 41 and plug-in containers 42) and plug-in modules 43. The platform module ensures that the system's performance can scale dynamically; it mainly provides the crawler central scheduler 41 and abstracts hardware resources, and contains no business logic. The central scheduler 41 distributes crawler tasks 44, with load balancing, to the crawler servers 42 for execution. Simply adding crawler servers immediately and linearly expands the throughput of the crawler system. The plug-in modules 43 ensure that the crawler system can scale dynamically at the data source level. Because the crawling logic and data structures differ from website to website and cannot be unified, the crawling logic for each data source is encapsulated in its own plug-in module 43. Plug-in modules 43 can be hot-plugged and executed on the crawler platform. Once the platform is deployed, simply developing plug-ins linearly expands the crawler's data sources.
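The separation between platform and plug-ins can be sketched in Python as follows. This is a single-process toy, not the patented system: the class name `CrawlPlatform`, the round-robin dispatch policy and the plug-in signature are all assumptions made for illustration, and real plug-ins would fetch pages over the network.

```python
from collections import defaultdict
from itertools import cycle


class CrawlPlatform:
    """Platform module: owns servers and scheduling, holds no site-specific logic."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.plugins = {}  # data source name -> crawl function

    def register_plugin(self, source, crawl):
        # "Hot plug": a new data source can be added while the platform is running.
        self.plugins[source] = crawl

    def add_server(self, name):
        # Adding a server increases the capacity available to the scheduler.
        self.servers.append(name)

    def dispatch(self, tasks):
        """Assign (source, url) tasks to servers round-robin and run the plug-in."""
        assignment = defaultdict(list)
        for server, (source, url) in zip(cycle(self.servers), tasks):
            assignment[server].append(self.plugins[source](url))
        return dict(assignment)


platform = CrawlPlatform(["server-1", "server-2"])
platform.register_plugin("news", lambda url: f"news page from {url}")
platform.register_plugin("social", lambda url: f"social post from {url}")
result = platform.dispatch([("news", "u1"), ("social", "u2"), ("news", "u3")])
print(result)
```

Round-robin is only the simplest load-balancing policy; a production scheduler would more likely weigh server load and task cost.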
Here, the information sources collected may be authoritative news websites, information announcement channels, social media, structured data interfaces and the like. The collected content is put into the original content library.
Format processing and noise cleaning
Because the collected information sources are rich in type, and the formats used, especially for unstructured data, vary widely, the collected data needs preliminary processing. The format processing techniques in this embodiment mainly include PDF format conversion and HTML cleaning.
PDF format conversion is mainly used to convert the obtained PDF files into HTML files. For example, in the financial field, the announcements issued by listed companies are in PDF format and contain important information in various forms such as text and charts. To extract and process the data in these announcements, this embodiment first converts the PDF files into HTML format after data collection. This technique is characterized by high accuracy and can retain the chart information in the original document without causing data omission or loss.
HTML cleaning is mainly used to clean the collected web page data, retaining only the main text of the page and screening out "noise" such as navigation, advertisements and videos. For most web pages, a general web page cleaning technique can be used to obtain the main text. For some pages with a more complex structure, targeted cleaning rules can be preset and applied to ensure the coverage and accuracy of web page cleaning.
Preliminary data screening
Preliminary data screening means that, during data collection, information such as the web page's domain name, column, title and publication time is considered together to filter out non-target data and retain only the target data from the specified information sources. Data collection mainly obtains data from major websites, each of which may contain external links in various forms. During collection, there is a certain probability that these externally linked pages are collected along with the target data; the purpose of preliminary data screening is to filter them out.
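The screening rule can be sketched as a simple predicate over page metadata. The field names (`url`, `column`, `title`, `published`) and the example domains are hypothetical; the patent only states that domain name, column, title and publication time are considered together.

```python
from datetime import date
from urllib.parse import urlparse


def is_target_page(page, allowed_domains, allowed_columns, earliest):
    """Keep a page only if it comes from a specified source and looks like target data."""
    host = urlparse(page["url"]).hostname or ""
    in_domain = any(host == d or host.endswith("." + d) for d in allowed_domains)
    return (
        in_domain
        and page["column"] in allowed_columns
        and bool(page["title"].strip())
        and page["published"] >= earliest
    )


pages = [
    {"url": "https://news.example.com/a/1", "column": "announcements",
     "title": "Q1 report", "published": date(2018, 4, 1)},
    {"url": "https://ads.example.net/x", "column": "announcements",
     "title": "Buy now", "published": date(2018, 4, 1)},  # externally linked page
]
kept = [p for p in pages
        if is_target_page(p, {"example.com"}, {"announcements"}, date(2018, 1, 1))]
print([p["url"] for p in kept])
```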
Text parsing
Text classification
After the text data is obtained, the text first needs to be assigned to a specific category; subsequent information extraction and parsing are then carried out according to the characteristics and requirements of each category of text. Text classification is therefore a very basic and vital step, and its quality directly affects the subsequent steps.
When the collected text data covers multiple fields, with varied sources, types, formats and subject matter and complex content, the text classification task is more challenging. In one embodiment, human experience can be combined with machine learning methods to establish a classification system and achieve automatic classification.
Take "listed company announcements" as an example of this kind of text. There are currently about 1000 listed companies, which issue thousands of announcements every day, up to 4000 per day at peak times and about 350,000 per year on average, in a wide variety of types and with highly complex content. Based on the experience of editors and reporters in the financial field, announcements can be divided into more than 90 categories. However, sampling and manual annotation showed that only about 40% of announcements can be accurately assigned to these 90-plus categories; the remaining 60% or so cannot be mapped to a specific category.
According to an embodiment of the present invention, a text classification method as shown in Fig. 5 is proposed. According to this method, one or more set categories are first obtained. Then, in step 52, when a first part of the multiple training text documents 51 is judged to meet a classification standard, this first part is assigned to the one or more set categories. When it is judged in step 53 that a second part of the multiple training text documents 51 cannot be assigned to the one or more set categories, then in step 54 this second part is divided into one or more clusters. The form of the clusters is shown at 55. In step 56, it can be judged whether a cluster is important; the judgment may be made manually or by machine. In step 57, classification labels established for the important clusters are received and, together with the existing set categories, form a new classification system 58.
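The classify-then-cluster flow can be sketched in plain Python. Two simplifications are assumed here and are not from the patent: known categories are matched by keyword sets, and the leftover texts are grouped by a greedy single-pass rule on Jaccard similarity, which stands in for a proper clustering algorithm. The labeling of important clusters (steps 56 and 57) is left to a human or a separate model.

```python
def jaccard(a, b):
    """Jaccard similarity of two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def classify_or_cluster(texts, category_keywords, threshold=0.3):
    """Assign texts to known categories; group the remainder into clusters."""
    classified, leftover = {}, []
    for text in texts:
        tokens = set(text.lower().split())
        label = next((c for c, kws in category_keywords.items() if tokens & kws), None)
        if label is None:
            leftover.append(text)  # cannot be assigned to any set category
        else:
            classified[text] = label

    clusters = []
    for text in leftover:
        tokens = set(text.lower().split())
        for cluster in clusters:
            if any(jaccard(tokens, set(m.lower().split())) >= threshold for m in cluster):
                cluster.append(text)
                break
        else:
            clusters.append([text])  # start a new cluster
    return classified, clusters


texts = [
    "dividend payout plan",
    "annual dividend notice",
    "lawsuit filed against supplier",
    "supplier lawsuit update",
    "board meeting",
]
classified, clusters = classify_or_cluster(texts, {"dividend": {"dividend"}})
print(classified)
print(clusters)
```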
To screen out important announcements comprehensively and effectively and to establish an effective classification standard, in one embodiment the above "unknown category" announcements are analyzed by hierarchical clustering, as shown in Fig. 6. Analysis of the clustering results shows that k = 22 clusters is a reasonable value. These 22 clusters are manually sampled, and the category of each is summarized.
After the announcements in the 22 categories have been manually sampled, summarized and judged, categories of high importance are added to the announcement classification system as separate categories, and categories of low importance are uniformly assigned to an "others" class, which determines the final classification standard for announcements. Fig. 7 is an example of a text category system according to an embodiment of the present invention.
Based on the classification system established above, combined with the domain knowledge base 33 built from domain knowledge (see Fig. 3), and using the source, original web page tags, title, text content, key feature words and the like as features, a model is trained and optimized by combining machine learning with rules to classify text automatically.
Named entity recognition
The purpose of the named entity recognition algorithm is to identify person names, place names, organization names and the like in a sentence and to manage them as entities or associated vocabulary. Large amounts of entity information such as person names, place names and organization names often appear in all kinds of text, and this information often plays an important role in text identification, classification and information extraction.
In one embodiment, a named entity recognition model with high accuracy is obtained based on a hidden Markov model (HMM), combined with a large collection of entity information such as person names, place names, government agencies and listed company names from fields such as politics, economics, science and technology, and culture, so as to identify and label entity information in text. Fig. 8 is an example of named entity recognition according to an embodiment of the present invention. As shown in Fig. 8, in this example "Wang Shi" is identified as a person name, "Vanke Co., Ltd" is identified as an organization, and "Shenzhen" and "Liuzhou" are classified as place names.
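To make the HMM idea concrete, the sketch below decodes entity tags with the Viterbi algorithm over a toy model. All probabilities, the four-tag set and the example sentence are invented for illustration; a real model would estimate them from the large entity collections described above.

```python
import math


def viterbi(tokens, states, start_p, trans_p, emit_p, unk=1e-6):
    """Most likely state sequence for a non-empty token list under an HMM (log-space)."""
    # best[i][s] = (log-probability of the best path ending in state s at token i, previous state)
    best = [{s: (math.log(start_p[s]) + math.log(emit_p[s].get(tokens[0], unk)), None)
             for s in states}]
    for tok in tokens[1:]:
        row = {}
        for s in states:
            prev, score = max(
                ((p, best[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            row[s] = (score + math.log(emit_p[s].get(tok, unk)), prev)
        best.append(row)

    # Backtrack from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for row in reversed(best[1:]):
        state = row[state][1]
        path.append(state)
    return path[::-1]


states = ["O", "PER", "ORG", "LOC"]
start_p = {s: 0.25 for s in states}
trans_p = {p: {s: (0.4 if p == s else 0.2) for s in states} for p in states}
emit_p = {  # toy emission lexicon, not learned from data
    "O": {"founded": 0.5, "in": 0.5},
    "PER": {"Wang": 0.5, "Shi": 0.5},
    "ORG": {"Vanke": 1.0},
    "LOC": {"Shenzhen": 1.0},
}
tokens = ["Wang", "Shi", "founded", "Vanke", "in", "Shenzhen"]
tags = viterbi(tokens, states, start_p, trans_p, emit_p)
print(list(zip(tokens, tags)))
```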
Keyword extraction
The goal of keyword extraction is to select several representative words from an article to indicate the main idea of the full text. When processing massive amounts of text, summarizing the keywords of each article not only helps users understand the text quickly, but also greatly improves the efficiency of retrieval, management and reading. It is important for tasks such as article topic extraction, hot word discovery, automatic document tagging and information indexing.
In an embodiment of the present invention, a high-accuracy keyword extraction model is trained on more than 120,000 texts with manually assigned keywords, using features such as part of speech, position of occurrence, TF-IDF, TextRank score and Word2Vec vectors. In practical tests, the coverage of this model is generally higher than that of currently public keyword extraction tools. Fig. 9 is an example of keyword extraction results according to an embodiment of the present invention. As shown in Fig. 9, for a text about Evergrande Life's short-term speculation in the A-share market attracting regulatory attention, the keyword extraction algorithm recommends four keywords, "insurance capital", "supervision", "Evergrande" and "speculation", which indicate the important information of the article well.
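Of the features listed above, TF-IDF is the easiest to show in isolation. The sketch below ranks words by a smoothed TF-IDF score only; it is not the trained model described in this embodiment, which combines several features, and the tiny corpus is made up for illustration.

```python
import math
from collections import Counter


def tfidf_keywords(doc_tokens, corpus_tokens, top_k=4):
    """Rank the words of one document by TF-IDF against a reference corpus."""
    n_docs = len(corpus_tokens)
    # Document frequency: in how many corpus documents each word occurs.
    df = Counter(tok for doc in corpus_tokens for tok in set(doc))
    tf = Counter(doc_tokens)
    scores = {
        tok: (count / len(doc_tokens)) * math.log((1 + n_docs) / (1 + df[tok]))
        for tok, count in tf.items()
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda tok: (-scores[tok], tok))[:top_k]


corpus = [
    ["the", "regulator", "the", "insurer"],
    ["the", "market"],
    ["the", "weather"],
]
print(tfidf_keywords(corpus[0], corpus, top_k=2))
```

A word that occurs in every document ("the" here) receives a score of zero under this smoothing, so it never outranks rarer words.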
Entity relation extraction
Entity relation extraction means that, after the named entities in the text have been identified, the type of relation between these entities is further determined, where the entity relations are predefined.
For example, in the text "... [Vanke|ORG] founder [Wang Shi|PER] ...", "Vanke" and "Wang Shi" are named entities, and the two also form an affiliation relation (Org-Aff.Founder). Entity relation extraction provides important technical support for numerous natural language processing tasks such as mass text processing and retrieval, automatic knowledge base construction, text association, machine translation and document summarization.
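The annotated example can be turned into a relation triple with a few lines of Python. This is a rule-based sketch using a hypothetical cue-word table; it is a deliberately simpler technique than the statistical model this document describes, and it only looks at adjacent entity pairs.

```python
import re

# Matches annotations such as "[Vanke|ORG]"; spaces around "|" are tolerated.
ENTITY = re.compile(r"\[\s*([^|\]]+?)\s*\|\s*([A-Z]+)\s*\]")

# Cue words between an entity pair that signal a predefined relation (illustrative).
CUES = {("ORG", "PER"): {"founder": "Org-Aff.Founder"}}


def extract_relations(annotated):
    """Return (left entity, relation, right entity) triples for adjacent entity pairs."""
    relations = []
    entities = list(ENTITY.finditer(annotated))
    for left, right in zip(entities, entities[1:]):
        between = annotated[left.end():right.start()].lower()
        cues = CUES.get((left.group(2), right.group(2)), {})
        for cue, relation in cues.items():
            if cue in between:
                relations.append((left.group(1), relation, right.group(1)))
    return relations


print(extract_relations("[Vanke|ORG] founder [Wang Shi|PER]"))
```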
In an embodiment of the present invention, a semi-supervised learning method is used: with a limited set of high-quality annotated documents as seeds, and with entity text, type, context, syntax tree distance, special sentence patterns and the like as features, a high-precision conditional random field (CRF) model is trained to extract entity relations from text. During iterative training, the output is checked by rules to further improve accuracy, and high-confidence results are fed back into the model for continued training.
Event extraction
The role of event extraction is to restate events expressed in natural language in the text in a standard structured form. Especially in news text, correctly identifying and extracting the events that occur in the text is vital for understanding the text content semantically and for deeper text mining.
In one embodiment of this invention, Event Extraction, including pretreatment (participle, subordinate sentence, interdependent syntactic analysis,
Entity recognition, relation recognition etc.), trigger word identification, candidate events sentence identification, event sentence judgement, event type judgement, event member
All multi-steps such as element identification, use based on syntax, pattern match, a variety of methods of machine learning in different step.Figure 10 is
Event extraction result example according to an embodiment of the invention, from text shown in Figure 10, the present embodiment, which has identified, " to be received
This event of purchase ", and corresponding purchaser, time buying, purchase object, concluded price.
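The trigger-word and pattern-matching steps can be sketched as follows. This is only an illustrative toy under stated assumptions: the trigger list, the regular expression, the sentence shape, and the company names are all hypothetical, and the dependency parsing and machine-learning steps described above are omitted.

```python
import re
from typing import Dict, Optional

# Hypothetical trigger words mapped to an event type.
TRIGGERS = {"acquired": "acquisition", "acquires": "acquisition"}

# Hypothetical pattern for one sentence shape:
# "On <date>, <acquirer> acquired <target> for <price>"
ELEMENT_PATTERN = re.compile(
    r"On (?P<time>\d{4}-\d{2}-\d{2}), (?P<acquirer>.+?) "
    r"(?:acquired|acquires) (?P<target>.+?) "
    r"for (?P<price>[\d.]+ (?:million|billion) \w+)"
)


def extract_event(sentence: str) -> Optional[Dict[str, str]]:
    """Return a structured event, or None if no trigger word is present."""
    words = re.findall(r"\w+", sentence.lower())
    event_type = next((TRIGGERS[w] for w in words if w in TRIGGERS), None)
    if event_type is None:
        return None  # not a candidate event sentence
    event = {"type": event_type}
    match = ELEMENT_PATTERN.search(sentence)
    if match:  # fill in the event elements found by pattern matching
        event.update(match.groupdict())
    return event


print(extract_event(
    "On 2018-03-01, Alpha Corp acquired Beta Ltd for 1.2 billion yuan."))
```

A sentence with a trigger word but an unrecognized shape still yields the event type, which leaves room for a later, more robust element-identification step.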
Document summary extraction
A high-quality document summary can greatly improve reading efficiency, letting users or readers quickly understand the content of a text and judge its usefulness and research value. The summary itself can also serve as high-quality material in the content creation process.
In one embodiment of the present invention, document summarization methods based on TextRank and on machine learning are both implemented, and both achieve good results. The TextRank-based method treats the whole text as a network in which each sentence is a node. After the correlation between sentences is computed from features such as word-sense distance and semantic distance, the importance score of each sentence can be calculated with the PageRank method; the higher a sentence's score, the more likely it is to appear in the summary. The machine-learning-based method uses features such as part of speech, word vectors, named entities, and relevance to the title to judge whether a sentence should appear in the summary.
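The TextRank idea described above can be sketched in a few lines. In this minimal example the sentence similarity is a simple normalized word overlap, which is an assumption standing in for the word-sense and semantic distances mentioned in the text; the damping factor and iteration count are likewise illustrative defaults.

```python
import math
import re
from typing import List


def _words(sentence: str) -> set:
    return set(re.findall(r"\w+", sentence.lower()))


def similarity(a: str, b: str) -> float:
    """Word-overlap similarity, normalized by sentence length."""
    wa, wb = _words(a), _words(b)
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))


def textrank(sentences: List[str], damping: float = 0.85,
             iterations: int = 50) -> List[float]:
    """Score sentences with PageRank-style iteration on a similarity graph."""
    n = len(sentences)
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(sentences: List[str], k: int = 1) -> List[str]:
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i],
                 reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


doc = [
    "The central bank raised interest rates today.",
    "Interest rates were raised by the central bank to curb inflation.",
    "My cat likes fish.",
]
print(summarize(doc, k=2))
```

Sentences that share vocabulary reinforce each other's scores, while an unrelated sentence receives only the baseline score and drops out of the summary.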
Key information extraction
Information extraction is one of the important steps in article generation. Only by extracting key information accurately and in a targeted way, converting unstructured data into structured data, can the extracted information be used for subsequent analysis, such as information association and content summarization, so as to generate articles with accurate data, reliable content, and rich information. According to an embodiment of the present invention, a method combining rules with machine learning is used to extract key information from the collected text.
Taking a "listing announcement" as an example, the named entities (place names, company names, person names, etc.), time information, stock information (stock code, share capital, issue price, etc.), company information (registered capital, main business, etc.), and other key data must be extracted in a targeted way (see Figure 11). During extraction, data with a fixed format and high accuracy requirements, such as stock code, share capital, and registered capital, are extracted by DT original texts king mainly with rule-based methods, to ensure the correctness of the data. Data that are expressed in varied ways and have no fixed format, such as named entities and time information, are identified and extracted mainly with machine learning methods.
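For the fixed-format fields, the rule-based side can be as simple as a table of regular expressions. The sketch below is an assumption-laden illustration: the field names, the English phrasing of the announcement, and the patterns themselves are hypothetical, and the machine-learning side for free-form fields is not shown.

```python
import re
from typing import Dict

# Hypothetical rules for fixed-format fields in a listing announcement.
RULES = {
    "stock_code": re.compile(r"stock code[:\s]+(\d{6})", re.IGNORECASE),
    "issue_price": re.compile(r"issue price[:\s]+([\d.]+\s*yuan)",
                              re.IGNORECASE),
    "registered_capital": re.compile(
        r"registered capital[:\s]+([\d,.]+\s*(?:million|billion)?\s*yuan)",
        re.IGNORECASE),
}


def extract_fields(text: str) -> Dict[str, str]:
    """Apply every rule and keep the first match for each field."""
    result = {}
    for name, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            result[name] = match.group(1)
    return result


announcement = ("Stock code: 603999. Issue price: 12.50 yuan per share. "
                "Registered capital: 400 million yuan.")
print(extract_fields(announcement))
```

Fields with no matching rule are simply absent from the result, so a caller can tell a missing value apart from an empty one.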
Sentiment polarity analysis
Sentiment polarity analysis is used to judge the emotional tone of sentences, paragraphs, and whole documents. Performing sentiment analysis on the sentences and the document as a whole helps summarize the viewpoints contained in the text and the author's subjective attitude; the results can be used in scenarios such as text retrieval, relevance calculation, content aggregation, and content recommendation.
According to an embodiment of the present invention, sentiment analysis of sentences and documents is treated as a multi-class classification problem: the input text is classified as negative (derogatory), positive (commendatory), or neutral. During sentiment classification, the evaluation words and combined evaluation units appearing in the text, word position features, n-gram word features, part-of-speech features, the sentiment categories of the preceding and following sentences, and so on can all be taken into account and used to train a machine learning model. Evaluation words and combined evaluation units can also be used to build rules, which are combined with the results of the machine learning model to obtain the final sentiment classification.
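One simple way to combine lexicon rules with a model's output is sketched below. Everything specific here is an assumption for illustration: the tiny word lists, the negation handling as a stand-in for "combined evaluation units", the hypothetical `model_probs` input, and the confidence threshold used to decide between model and rule.

```python
import re
from typing import Dict, List, Optional

POSITIVE = {"surge", "win", "strong", "growth"}
NEGATIVE = {"plunge", "loss", "weak", "decline"}
NEGATORS = {"not", "no", "never"}


def lexicon_score(tokens: List[str]) -> int:
    """Sum evaluation-word polarities, flipping a word that follows a negator."""
    score = 0
    for i, tok in enumerate(tokens):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity and i > 0 and tokens[i - 1] in NEGATORS:
            polarity = -polarity  # e.g. "not strong" counts as negative
        score += polarity
    return score


def classify(text: str, model_probs: Optional[Dict[str, float]] = None,
             threshold: float = 0.6) -> str:
    """Use the model label when it is confident, otherwise fall back to rules."""
    tokens = re.findall(r"[a-z']+", text.lower())
    rule = lexicon_score(tokens)
    rule_label = "positive" if rule > 0 else "negative" if rule < 0 else "neutral"
    if model_probs:
        label, prob = max(model_probs.items(), key=lambda kv: kv[1])
        if prob >= threshold:
            return label
    return rule_label


print(classify("Sales were not strong"))
print(classify("Shares surge",
               {"negative": 0.9, "positive": 0.05, "neutral": 0.05}))
```

Other combination schemes, such as feeding the rule score to the model as an extra feature, would fit the description equally well.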
Data analysis technologies
Common numerical computation and statistical methods
After structured data is obtained through data collection or text parsing, the data must still be analyzed and computed before valuable results can be obtained. For example, the finance field often needs to calculate year-on-year and period-on-period increases and decreases; the sports field often needs to calculate the usage rate and success rate of different techniques; and the e-commerce field often needs to calculate top search terms, best-selling products, and sales trends.
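As a worked example of the finance case, the sketch below computes year-on-year and period-on-period percentage changes for the latest point of a monthly series. The function names and the 12-month period are assumptions chosen for the illustration.

```python
from typing import List, Tuple


def pct_change(current: float, previous: float) -> float:
    """Percentage change from previous to current."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / abs(previous) * 100.0


def yoy_and_pop(series: List[float], period: int = 12) -> Tuple[float, float]:
    """Return (year-on-year, period-on-period) change for the latest value.

    `series` is in chronological order, one value per month.
    """
    if len(series) < period + 1:
        raise ValueError("series too short for a year-on-year comparison")
    latest = series[-1]
    year_on_year = pct_change(latest, series[-1 - period])
    period_on_period = pct_change(latest, series[-2])
    return year_on_year, period_on_period


monthly = [100.0] + [105.0] * 11 + [110.0]
yoy, pop = yoy_and_pop(monthly)
print(f"year-on-year: {yoy:.2f}%, month-on-month: {pop:.2f}%")
```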
In one embodiment of the present invention, efficient, targeted data analysis can be carried out according to the needs of different scenarios. Preferably, data is analyzed in real time as it is obtained, and the analysis results are used for real-time content generation, ensuring the timeliness of the whole pipeline.
Anomaly detection
In particular scenarios, when data is monitored in real time, outliers need to be found and corresponding content generated and reported promptly, for example when commodity trading volume is abnormal, stock market data surges or plunges, or seismic monitoring data is abnormal.
In one embodiment of the present invention, normal values are modeled according to domain-specific knowledge, and outlier detection is implemented for different scenarios using methods such as hypothesis testing, pattern discovery, and machine learning. The different methods can be compared and cross-checked against one another, which improves the credibility of the results and reduces the false alarm rate. Uninterrupted monitoring of data greatly reduces the cost of manual monitoring and avoids missed or omitted reports caused by human factors.
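A z-score check against a model of normal values is one simple instance of the statistical approach mentioned above. The sketch below assumes the normal values are roughly described by the mean and standard deviation of recent history; the threshold of 3 and the sample data are illustrative assumptions, not values from the patent.

```python
import statistics
from typing import List, Tuple


def find_outliers(history: List[float], new_values: List[float],
                  z_threshold: float = 3.0) -> List[Tuple[float, float]]:
    """Flag new values whose z-score against the history exceeds the threshold."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)  # needs at least two history points
    flagged = []
    for value in new_values:
        z = (value - mean) / stdev if stdev > 0 else 0.0
        if abs(z) > z_threshold:
            flagged.append((value, z))
    return flagged


# Hypothetical daily percentage changes of a stock index.
history = [0.2, -0.1, 0.3, 0.0, -0.2, 0.1, -0.3, 0.2]
alerts = find_outliers(history, [0.1, -9.5])
print(alerts)
```

Running a second, independent detector on the same stream and reporting only values flagged by both is one way to realize the cross-checking described above.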
Content generation
Writing scenes and logic templates
Across different fields, types, and subject matter there can be many writing scenes, and different writing scenes correspond to different writing requirements in many respects, including required data, common wording, common style, article length, and writing logic, of which writing logic is the most critical and important. In order to store different writing scenes and their corresponding writing requirements in a structured way, according to an embodiment of the present invention, various features can be used to describe a writing scene qualitatively. According to an embodiment of the present invention, a framework for describing different writing requirements is also preconfigured; it is readable, reusable, shareable, and simple to modify and maintain. In the context of this application, this framework is referred to as a "logic template".
A logic template has writing logic at its core and also includes information such as required data, wording preferences, length limits, and sentiment polarity preferences. The sentence is the basic structure of a logic template. Each logic template may include one or more candidate sentences, and each candidate sentence includes one or more candidate named entities, morphemes, and clauses. More specifically, every sentence contains replaceable expressions (entities, phrases, words, clauses, etc.), which are marked with special symbols. Filling a sentence with different entities, phrases, or words changes its wording and semantics, so even a single logic template can produce articles with more varied expression. The complexity of each logic template depends on the precision and complexity of the writing logic. The more precise and complex the writing logic (for example, financial reports and data analysis reports), the more complex the logic template; conversely, the simpler the writing logic (for example, news briefs), the simpler the logic template.
A logic template represents one line of writing that may be adopted in the corresponding writing scene, and one writing scene can have multiple logic templates. A logic template is a concrete embodiment of writing logic and requirements, and of writing experience and knowledge: it converts abstract experience into concrete text that can be read, modified, and shared across platforms. This is of great significance and value for passing on, learning, and improving content-creation knowledge.
In practical applications, once all the data required by a logic template has been obtained through information collection, text parsing, and data analysis, an embodiment of the present invention automatically selects expressions that match the sentiment polarity (for example, in sports: a complete victory, a narrow win, a regrettable loss, a crushing defeat) and automatically generates the corresponding article from the logic template.
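A minimal sketch of filling such a template is shown below. The `{slot}` marker syntax, the template sentence, the score-margin thresholds, and the chosen phrases are all assumptions made for the example; the patent only states that replaceable expressions are marked with special symbols and that polarity-appropriate expressions are selected.

```python
from string import Formatter
from typing import Dict, Set

# Hypothetical one-sentence logic template; {name} marks a replaceable slot.
TEMPLATE = "On {date}, {home} {result} {away} {home_score}-{away_score}."


def required_slots(template: str) -> Set[str]:
    """List the slot names a template needs before it can be rendered."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def result_phrase(margin: int) -> str:
    """Pick an expression whose polarity matches the score margin."""
    if margin >= 3:
        return "swept"
    if margin > 0:
        return "narrowly beat"
    if margin == 0:
        return "drew with"
    if margin >= -2:
        return "narrowly lost to"
    return "was routed by"


def generate(data: Dict[str, object]) -> str:
    """Render the template once every required slot has data."""
    filled = dict(data)
    filled["result"] = result_phrase(data["home_score"] - data["away_score"])
    missing = required_slots(TEMPLATE) - set(filled)
    if missing:
        raise KeyError(f"missing data for slots: {sorted(missing)}")
    return TEMPLATE.format(**filled)


print(generate({"date": "May 1", "home": "Team A", "away": "Team B",
                "home_score": 3, "away_score": 0}))
```

Checking the required slots before rendering mirrors the condition in the text that generation starts only after all data needed by the template has been obtained.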
Deep learning
According to an embodiment of the present invention, a deep learning method is used to automatically generate a passage of text from input parameters. This passage is inserted into the logic template as a paragraph and ultimately becomes part of the whole article. Taking product description generation as an example, when the specified product is clothing, the input parameters include dozens of features such as type, color, garment length, target customers, launch date, left front, material, place of origin, the common people, style, brand, and price.
The step of generating an article from the named entities, entity relations, and event morphemes by applying the writing scene and logic template may include: automatically generating a paragraph from the input parameters using a deep learning method. This paragraph can be inserted into the logic template.
Based on continued experiments and model iteration, and in order to ensure the coherence and correctness of the text, the deep learning method is used mainly to generate shorter text fragments.
Content association and aggregation
When generating content, it is usually necessary to aggregate several kinds of information into a complete article. For example, when analyzing macroeconomic data, the basic data published officially needs to be associated with statisticians' analysis results and domain experts' comments. Expert analysis tends to be broad in coverage and varied in content: it may analyze not only the factors causing the data to change (for example, mentioning food, services, and education when analyzing CPI) but also the effect of the data change on macroeconomic control policy. Common text similarity algorithms are therefore unsuitable in such a scenario. According to an embodiment of the present invention, a domain knowledge graph is built, and the entities, events, and sentiment polarities mentioned in the texts are analyzed and compared to calculate the degree of association between texts.
Taking CPI data as an example, after the official basic data is published, an embodiment of the present invention parses the collected expert opinions to judge whether the content of each opinion is related to CPI, whether it is consistent with the official data, whether its sentiment polarity matches the data change, and so on. Opinions that are unrelated to the topic or contain errors are screened out; the remaining expert opinions are further ranked and automatically selected, then added to the article as paragraphs.
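The screening and ranking of expert opinions can be sketched as an entity-overlap score computed after expanding entities through a knowledge graph. The tiny graph, the Jaccard overlap, the polarity check, and the 0.2 cut-off below are all assumptions for illustration; the patent does not disclose the scoring formula.

```python
from typing import Dict, List, Set

# Hypothetical fragment of a domain knowledge graph: entity -> related concepts.
KNOWLEDGE_GRAPH: Dict[str, Set[str]] = {
    "CPI": {"food prices", "services", "education"},
}


def expand(entities: Set[str]) -> Set[str]:
    """Add concepts related to each entity according to the knowledge graph."""
    expanded = set(entities)
    for entity in entities:
        expanded |= KNOWLEDGE_GRAPH.get(entity, set())
    return expanded


def association(official: dict, opinion: dict) -> float:
    """Jaccard overlap of expanded entities; zero if the polarities conflict."""
    a, b = expand(official["entities"]), expand(opinion["entities"])
    overlap = len(a & b) / len(a | b) if a | b else 0.0
    return overlap if official["polarity"] == opinion["polarity"] else 0.0


def select_opinions(official: dict, opinions: List[dict],
                    min_score: float = 0.2) -> List[str]:
    """Drop weakly associated opinions and rank the rest by score."""
    scored = [(association(official, o), o) for o in opinions]
    kept = [(s, o) for s, o in scored if s >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [o["id"] for _, o in kept]


official = {"entities": {"CPI"}, "polarity": "positive"}
opinions = [
    {"id": "a", "entities": {"CPI", "food prices"}, "polarity": "positive"},
    {"id": "b", "entities": {"food prices"}, "polarity": "positive"},
    {"id": "c", "entities": {"football"}, "polarity": "positive"},
    {"id": "d", "entities": {"CPI"}, "polarity": "negative"},
]
print(select_opinions(official, opinions))
```

Opinion "b" never mentions CPI directly, yet it is retained because the graph links food prices to CPI, which is the kind of association a plain text-similarity measure tends to miss.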
Article distribution
The intelligent writing platform supports integration with third-party data platforms, helping content creators publish their content promptly and efficiently. Through individual configuration, articles can be sent to platforms such as Weibo, WeChat, and enterprise CMSs; this is implemented mainly through data interfaces.
Reader behavior analysis
In this step, click and reading information for the articles published on the platforms can be obtained, for example: number of readers, reading duration, number of reposts, number of likes, number of comments, comment content, and basic reader information (age, occupation), etc. Based on these data, combined with information such as the article's topic, keywords, and sentiment polarity, technologies such as user profiling and big data analysis can be used to analyze how much readers of different ages, genders, occupations, and regions prefer different topics and articles.
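A very small version of such a preference analysis is sketched below: reading time is aggregated per reader group and topic, then normalized into shares. The record fields (`age_band`, `topic`, `read_seconds`) and the use of reading time as the preference signal are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, List


def preference_by_group(events: List[dict],
                        group_key: str) -> Dict[str, Dict[str, float]]:
    """Share of reading time each reader group spends on each topic."""
    totals = defaultdict(lambda: defaultdict(float))
    for event in events:
        totals[event[group_key]][event["topic"]] += event["read_seconds"]
    prefs = {}
    for group, by_topic in totals.items():
        total = sum(by_topic.values())
        if total > 0:  # skip groups with no recorded reading time
            prefs[group] = {t: s / total for t, s in by_topic.items()}
    return prefs


# Hypothetical reading records returned by a publishing platform.
events = [
    {"age_band": "18-29", "topic": "finance", "read_seconds": 30},
    {"age_band": "18-29", "topic": "sports", "read_seconds": 90},
    {"age_band": "30-44", "topic": "finance", "read_seconds": 120},
]
print(preference_by_group(events, "age_band"))
```

The same function can be called with a different `group_key`, such as occupation or region, as long as the records carry that field.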
Diagnosis
The diagnosis step takes two kinds of information as input: (1) the errors that editors find, and the modifications they make to the manuscript, during content review and revision; (2) the analysis results of the reader behavior analysis module. Using the first kind of information, error statistics and error analysis can be performed for each step of the system, helping the system iterate and optimize continually according to actual conditions and thereby further improving its efficiency. The second kind of information serves as a reference for distribution platforms and content creators when selecting articles and modifying content.
The automatic text generation method and system of the embodiments of the present invention combine key modules such as information collection, data analysis, text editing, content publishing, and data feedback, realizing an integrated automatic writing process and improving the efficiency and timeliness of content production.
The automatic text generation method and system of the embodiments of the present invention can automatically collect from hundreds of information sources in fields such as finance and e-commerce, covering many authoritative information publishers such as national ministries and commissions, the secondary market, and experts' social media accounts. After analyzing and processing the structured and unstructured (text) data therein, they can dynamically generate articles in various languages (such as Chinese and English).
In the above method, the techniques and steps of the information collection, text parsing, and content generation parts can be omitted, added, or replaced according to specific needs. In practice, if information such as the keywords, summary, or sentiment polarity of the text is not needed, the corresponding steps can be omitted; if data beyond the information extracted by the above method is needed, corresponding text parsing modules can be added, such as hot-topic discovery and topic extraction; likewise, different technical means can be chosen to achieve the same parsing purpose, for example replacing rules with a machine learning model.
Although the present invention has been described with reference to the current specific embodiments, those of ordinary skill in the art should appreciate that the above embodiments are intended only to illustrate the present invention, and that various equivalent changes or substitutions can be made without departing from its spirit. Therefore, changes and variations to the above embodiments that remain within the spirit of the present invention all fall within the scope of the following claims.
Claims (10)
1. An automatic text writing method, comprising the following steps:
an information collection process, comprising: collecting information from the Internet, performing format conversion on the information, performing noise cleaning on the information, and performing preliminary data screening on the information to obtain text, wherein the text includes an unstructured part;
a text parsing process, comprising: classifying the text, identifying named entities in the text according to the classification of the text, extracting entity relations between the named entities in the text according to the classification of the text, and extracting event morphemes that reflect events in the text according to the classification of the text;
a content generation process, comprising: preconfiguring one or more writing scenes, preconfiguring one or more logic templates, generating paragraphs from the named entities, the entity relations, and the event morphemes by applying the writing scenes and logic templates, and identifying associated paragraphs and aggregating them into an article;
a product presentation process, comprising: distributing the article to one or more Internet platforms;
a reader behavior analysis process, comprising: obtaining reader behavior information from the one or more Internet platforms, and analyzing the reader behavior information,
wherein the information collection process, the text parsing process, and the content generation process are adjusted according to the reader behavior information.
2. The automatic text writing method according to claim 1, characterized in that the text parsing process further comprises: extracting keywords refined in advance from the text.
3. The automatic text writing method according to claim 1, characterized in that the text parsing process further comprises: extracting key information from the text.
4. The automatic text writing method according to claim 1, characterized in that the text parsing process further comprises: extracting the sentences in the text that constitute a document summary.
5. The automatic text writing method according to claim 1, characterized in that the text parsing process further comprises: analyzing the sentiment polarity of the text.
6. The automatic text writing method according to claim 1, characterized by further comprising a data analysis process, the data analysis process comprising: performing numerical computation and statistics on the data in the text, and monitoring whether outliers occur in the data in the text.
7. The automatic text writing method according to claim 1, characterized in that the step of classifying the text comprises classifying according to pre-established categories, wherein the method of pre-establishing the categories comprises:
obtaining one or more set categories;
assigning a first part of a plurality of training texts to the one or more categories;
dividing a second part of the plurality of training texts, which cannot be assigned to the one or more categories, into one or more clusters;
receiving classification labels established for the one or more clusters.
8. The automatic text writing method according to claim 1, characterized in that each of the one or more logic templates includes one or more candidate sentences, and each candidate sentence includes one or more candidate named entities, morphemes, and clauses.
9. The automatic text writing method according to claim 1, characterized in that the step of generating an article from the named entities, the entity relations, and the event morphemes by applying the writing scenes and logic templates comprises: automatically generating a paragraph from input parameters using a deep learning method, the paragraph being inserted into the logic template.
10. An automatic text writing system, comprising:
a memory for storing instructions executable by a processor;
a processor for executing the instructions to implement the method according to any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810331488.4A CN108563620A (en) | 2018-04-13 | 2018-04-13 | The automatic writing method of text and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108563620A true CN108563620A (en) | 2018-09-21 |
Family
ID=63534917
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810331488.4A Pending CN108563620A (en) | 2018-04-13 | 2018-04-13 | The automatic writing method of text and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108563620A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101356526A (en) * | 2006-01-03 | 2009-01-28 | 伊斯曼柯达公司 | Method for generating a work of communication |
CN102999516A (en) * | 2011-09-15 | 2013-03-27 | 北京百度网讯科技有限公司 | Method and device for classifying text |
CN103049581A (en) * | 2013-01-21 | 2013-04-17 | 北京航空航天大学 | Web text classification method based on consistency clustering |
CN104850588A (en) * | 2015-04-24 | 2015-08-19 | 深圳市梦网科技股份有限公司 | Method and system for generating and publishing media content |
US20160103824A1 (en) * | 2014-10-10 | 2016-04-14 | Wriber Inc. | Method and system for transforming unstructured text to a suggestion |
JP6097428B1 (en) * | 2016-03-14 | 2017-03-15 | ナレッジスイート株式会社 | Report creation support system |
US20170078621A1 (en) * | 2015-09-16 | 2017-03-16 | Intel Corporation | Facilitating personal assistance for curation of multimedia and generation of stories at computing devices |
CN107133210A (en) * | 2017-04-20 | 2017-09-05 | 中国科学院上海高等研究院 | Scheme document creation method and system |
CN107408113A (en) * | 2015-03-31 | 2017-11-28 | 华为技术有限公司 | For analyzing the analysis engine and method of pre-generatmg data report |
Non-Patent Citations (1)
Title |
---|
郑铁男等: "《数字编辑实训教程》", 30 September 2017, 知识产权出版社 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108563620A (en) | | The automatic writing method of text and system |
Arora et al. | | Character level embedding with deep convolutional neural network for text normalization of unstructured data for Twitter sentiment analysis |
Bucur | | Using opinion mining techniques in tourism |
Hofmann et al. | | Text mining and visualization: Case studies using open-source tools |
Bauer et al. | | Quantitive evaluation of Web site content and structure |
US20190188326A1 (en) | | Domain specific natural language understanding of customer intent in self-help |
US9116985B2 (en) | | Computer-implemented systems and methods for taxonomy development |
US10366117B2 (en) | | Computer-implemented systems and methods for taxonomy development |
Barbosa et al. | | Evaluating hotels rating prediction based on sentiment analysis services |
Plank | | Domain adaptation for parsing |
Lalata et al. | | A sentiment analysis model for faculty comment evaluation using ensemble machine learning algorithms |
Yeasmin et al. | | Study of abstractive text summarization techniques |
Sandhiya et al. | | A review of topic modeling and its application |
Sohail et al. | | Anti-social behavior detection in urdu language posts of social media |
Mushtaq et al. | | Educational data classification framework for community pedagogical content management using data mining |
Nguyen et al. | | A model of convolutional neural network combined with external knowledge to measure the question similarity for community question answering systems |
CN113326348A (en) | | Blog quality evaluation method and tool |
Pertsas et al. | | Ontology-driven information extraction from research publications |
Hassanian-esfahani et al. | | A survey on web news retrieval and mining |
Pinto et al. | | Intelligent and fuzzy systems applied to language & knowledge engineering |
Philip et al. | | A Brief Survey on Natural Language Processing Based Text Generation and Evaluation Techniques |
DeVille et al. | | Text as Data: Computational Methods of Understanding Written Expression Using SAS |
Rojas-Simon et al. | | Background of the ETS |
Jakubícek et al. | | Walking the tightrope between linguistics and language engineering |
Tschuggnall et al. | | What grammar tells about gender and age of authors |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20180921 |