CN107066599A - Method and system, based on knowledge-base reasoning, for retrieving and classifying enterprises similar to a listed company - Google Patents
- Publication number
- CN107066599A (application number CN201710259506.8A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/285—Clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24564—Applying rules; Deductive queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/28—Databases characterised by their database models, e.g. relational or object models
- G06F16/284—Relational databases
- G06F16/288—Entity relationship models
Abstract
The invention discloses a method and system, based on knowledge-base reasoning, for retrieving and classifying enterprises similar to a listed company. The method comprises the following steps: obtaining company information, parsing and storing the data, integrating and analyzing the data, and building an enterprise entity knowledge base. The system comprises a listed-company information acquisition module, an information extraction and structuring module, a keyword optimization and retrieval module, and a similarity-matrix processing and knowledge-base construction module. The invention addresses the technical problems that traditional classification schemes have incomplete coverage and that conventional benchmark-enterprise retrieval systems have imperfect classification and low retrieval efficiency.
Description
Technical field
The present invention relates to an information analysis and retrieval technique in the field of financial investment.
Background art
In the field of financial investment, investors need to perform detailed business-model analysis, financial analysis, and a reasonable valuation of a target enterprise. Research on a target company often requires the operating data of competitors in the same industry or field as supporting reference, together with a suitable valuation model, in order to model or forecast the company's expected operating data and identify potential investment targets. Companies in the same field or industry are conventionally found through existing industry classification models: investment-oriented systems such as the Global Industry Classification Standard (GICS), Russell Global Sectors (RGS), and the Industry Classification Benchmark (ICB), and administrative government systems such as the national economic industry classification and the industry classification of listed companies. As emerging technologies continue to advance, listed companies that blend multiple fields and industries keep appearing, and traditional classification schemes struggle to fully cover companies in new technology fields.
Information retrieval is the activity of obtaining, from a collection of information resources, those resources relevant to an information need. Retrieval can be based on full text or on other content-based indexes. Web search engines are the most common information retrieval application. During retrieval, each query ranks the information resource objects, and the degree of association and the ranking information between different objects are organized and stored. An information object is usually a content collection or entity data stored in a database: content is extracted from the original information resources, and the valid entities and the associations between them are sorted out to serve as the direct objects of retrieval. A mature search engine typically scores the entity objects stored in the system by how well they match each query and then ranks them. Each query result shows the top-ranked entities for that query and their associated entities. Conventional benchmark-enterprise retrieval systems have imperfect classification and low retrieval efficiency.
Similarity extraction is a content extraction approach that retrieves documents similar or related to a given document on the basis of its content features. Evaluating document relevance over the constructed entity database and establishing a similarity ranking among entities can effectively improve retrieval speed and return useful information. Common relevance measures include the vector space model, the probabilistic model, and the inference network model. The vector space model represents each document as a vector in a keyword-based space and ranks document similarity by comparing the distances between document vectors. The probabilistic model computes the probability that a query keyword and a document are related; using prior and posterior empirical probabilities from the domain and a Bayesian model, it derives the degree of association between keywords and documents and ranks the documents by similarity. The inference network model is a similarity retrieval model with knowledge reasoning capability; depending on the calculation strategy, it can provide the degree of association between a query and documents as well as the similarity ranking among documents. Specific calculation strategies include vector space and keyword-weight probability.
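As an illustration of the vector space model described above, the following minimal sketch compares two documents by the cosine of the angle between their keyword-count vectors. Whitespace tokenization and the function name `cosine_similarity` are assumptions made for this example; the patent does not prescribe a particular distance measure or tokenizer.

```python
import math
from collections import Counter


def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine of the angle between two keyword-count vectors."""
    # Each document becomes a sparse vector: keyword -> occurrence count.
    a, b = Counter(doc_a.split()), Counter(doc_b.split())
    # Only keywords present in both documents contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)
```

Documents that share no keywords score 0, and documents with identical keyword distributions score approximately 1.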
Because traditional classification methods have the problems described above, there is a need for an automatic similar-enterprise retrieval and classification system that combines information retrieval techniques with search ranking and related algorithms.
Summary of the invention
The object of the invention is to provide a method and system, based on knowledge-base reasoning, for retrieving and classifying enterprises similar to a listed company, in order to solve technical problems such as the incomplete coverage of traditional classification schemes and the imperfect classification and low retrieval efficiency of conventional benchmark-enterprise retrieval systems.
To achieve the above object, the method of the invention for retrieving and classifying enterprises similar to a listed company, based on knowledge-base reasoning, comprises the following steps:
1) Obtain company information: collect data on all listed companies, including prospectuses, annual reports, important announcements, financial reports, industry research reports, patent information, litigation information, bidding information, and important company news;
2) Parse and store the data: a parser converts the crawled data into a suitable format and stores it in a database; the parser comprises a type analyzer and a format analyzer, which handle the complex data types and formats and convert them into a unified format;
3) Integrate and analyze the data: the existing data undergoes deduplication, extraction of structured content information, and information classification; for each specific enterprise, an enterprise data profile is built that describes the company by category from the angles of main business composition, equity-participation and holding-company relationships, and financial indicators;
4) Build the enterprise entity knowledge base: using Chinese word segmentation, part-of-speech tagging, recognition and annotation, and rule matching, the company information is structurally analyzed at the paragraph and sentence level, and entities and relations are extracted; the enterprise entity knowledge base is then built with a word vector model and through the steps of inverted indexing, keyword optimization, similarity ranking, and entity relationship matching;
5) Return the benchmark company information related to the target enterprise according to the search keyword.
Parsing and storing the data means parsing and extracting the obtained listed-company operating data by type. The acquired data is submitted uniformly to the type parser; for data of different formats, the parser contains a corresponding data-type interface module that recognizes and parses the data. The format analyzer then analyzes the different formats and converts the various company data into a unified format. After parsing, the data is stored in a database.
Integrating and analyzing the data means further cleaning the data once it is in the unified format. The data is first deduplicated: because a company's data includes a large amount of descriptive data, financial data, and news data, the existing data must be checked and cleaned after the first layer of format analysis to remove duplicates. The deduplicated data still contains a large amount of redundant data such as useless tags and formatting, so a rule-based recognition technique is applied to the cleaned data to extract the useful data. Finally, according to each company's situation, the data is classified into categories that mainly include financial model, peer-enterprise comparison, product structure, sales model, and customers and markets.
Building the enterprise entity knowledge base comprises the following. First, a full-text index is built over the data: distributed search engine technology builds a full-text index over the structured data, full-text term retrieval is performed on the relevant documents, the text data is converted into space vectors, and a vector model scores the relevance of the text.
Second, data chunks are extracted according to the query keyword information: the distributed search engine searches the database, and the relevant company data is extracted to form a data chunk, which optimizes retrieval.
Third, the query keyword library is determined: for a specific query keyword, the data chunks associated with it are organized into a search cache, which improves the efficiency of search queries.
Fourth, similarity is computed over the data chunks: company information is first modeled from the data chunks, the text data is converted into vectors with the word vector model, and similarity is computed on the resulting vector matrix. The computation proceeds in several layers: the company information is vectorized with the keyword-entity vector model, an index over the company information is built with the inverted index technique, the search keywords are optimized, enterprise similarity is optimized with the similarity computation technique, entity relationship matching is completed, and a similarity matrix is generated.
Fifth, the retrieval results whose similarity exceeds a threshold are returned according to the similarity matrix.
A system, based on knowledge-base reasoning, for retrieving and classifying enterprises similar to a listed company comprises:
a listed-company information acquisition module, which acquires and organizes the various commonly used company information data;
a data parsing and format analysis module, which converts the crawled data into a unified format; the type and format of the data must be analyzed, different parsing algorithms are used for different data types and formats to convert them into the unified format, and the data is finally stored in a suitable database;
an information extraction and structuring module, which performs further integration analysis on the data in the unified format, including data deduplication, information extraction, and information classification algorithms;
a keyword optimization and retrieval module, which builds a full-text index over the integrated data on a distributed search engine, forms data chunks from the relevant data retrieved for a query keyword, and scores the data for relevance, improving retrieval efficiency;
a similarity-matrix processing and knowledge-base construction module, which computes the similarity of company data from the data chunks: the data is first modeled on the basis of the company information, then converted into vector form with the word vector model, and the enterprise entity knowledge base is built through inverted indexing, keyword optimization, similarity ranking, and entity relationship matching; the input search keywords are matched by inference.
Advantages of the invention:
The invention uses an inference-strategy network model based on knowledge-base reasoning. It ranks associated enterprises by similarity by combining the product structure, main business services, competitors, competitive situation, and business-cycle sensitivity of the target enterprise and related enterprises with their statistical degree of association, in order to find benchmark enterprises and to provide a data basis for investment analysis of broader global industry chains and upstream and downstream associations. To address the imperfect classification and low retrieval efficiency of existing benchmark-enterprise retrieval systems, a method and system based on knowledge-base reasoning for retrieving and classifying enterprises similar to a listed company are proposed. The invention has the advantages of complete classification coverage, sound classification in the benchmark-enterprise retrieval system, and high retrieval efficiency.
Brief description of the drawings
Fig. 1 is a flow chart of the method for retrieving similar listed companies in the example.
Fig. 2 is a flow chart of parsing and storing data and of integrating and analyzing data in the example.
Fig. 3 is a flow chart of generating the company similarity matrix.
Fig. 4 is a flow chart of the system for retrieving similar listed companies in the example.
Detailed description of the embodiments
The invention is described in detail below with reference to an example. The described example is clearly only one of the embodiments of the application. It should be understood that the preferred embodiments described here serve only to illustrate and explain the invention and do not limit the application. All other examples that those skilled in the art obtain on the basis of the examples in this application fall within the scope of protection of the application.
Fig. 1 is the overall flow chart of the method and describes the operating flow of the similar-company retrieval method.
101-104 Listed-company data sources. As required by the data processing of the invention, the information that listed companies release through public channels is collected. It includes industry research reports, company announcements, financial reports, and related important news, as well as prospectuses, annual reports, major announcements, litigation information, patent information, and other content that reflects changes in a company's day-to-day operations;
105 Listed-company information acquisition. An acquisition method is determined for each of the data sources above. Industry research reports, company announcements, and similar data are usually PDF documents in text form, so the specific documents need to be updated and stored. Financial reports and similar data are structured numeric data with labels, so they are acquired and updated in batches in the manner suited to numeric data, and identical report fields of the same company are modeled in association. Litigation information, patent information, and the like are published on web pages, so the page structure must be recognized effectively and the useful data extracted and analyzed.
106 Parsing and storing data. The listed-company operating data obtained in 105 is parsed and extracted by type. The acquired data is submitted uniformly to the type parser. For data of different formats (text data such as PDF and Word, structured data such as JSON and XML, and web data such as HTML), the parser contains a corresponding data-type interface module that recognizes and parses the data. The format analyzer then analyzes the different formats and converts the various company data into a unified format. After parsing, the data is stored in a database;
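The type-dispatched parsing in step 106 can be sketched as a registry that maps each source type to a parser producing one unified record shape. This is only a sketch under stated assumptions: the record layout (`source_type`, `fields`, `text`) and all function names are hypothetical, and PDF and Word parsing are left out because they need third-party libraries.

```python
import json
import xml.etree.ElementTree as ET
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the visible text of an HTML page and drops the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())


def parse_json(raw):
    return {"source_type": "json", "fields": json.loads(raw), "text": ""}


def parse_xml(raw):
    root = ET.fromstring(raw)
    # Flatten only the direct children of the root element.
    fields = {child.tag: (child.text or "").strip() for child in root}
    return {"source_type": "xml", "fields": fields, "text": ""}


def parse_html(raw):
    extractor = _TextExtractor()
    extractor.feed(raw)
    extractor.close()
    return {"source_type": "html", "fields": {}, "text": " ".join(extractor.chunks)}


# Hypothetical registry: one "data-type interface module" per source type.
PARSERS = {"json": parse_json, "xml": parse_xml, "html": parse_html}


def parse_document(raw, source_type):
    """Dispatch to the parser for source_type and return a unified record."""
    try:
        parser = PARSERS[source_type]
    except KeyError:
        raise ValueError(f"no parser registered for type {source_type!r}") from None
    return parser(raw)
```

Supporting a new source type then only requires registering one more function in `PARSERS`; the downstream storage code sees the same record shape regardless of the input format.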
107 Integrating and analyzing data. Once the data is in the unified format, it must be cleaned further. The data is first deduplicated: because a company's data includes a large amount of descriptive data, financial data, and news data, the existing data usually still needs to be checked and cleaned after the first layer of format analysis to remove duplicates, which improves data validity and reduces the load on the storage system. The deduplicated data still contains a large amount of redundant data such as useless tags and formatting, so a recognition algorithm is applied to the cleaned data to extract the useful data. Finally, the data is classified into categories including financial model, peer-enterprise comparison, product structure, sales model, and customers and markets;
108 Building a full-text index over the data. To speed up retrieval of the processed data, distributed search engine technology builds a full-text index over the structured data, full-text term retrieval is performed on the relevant documents, the text data is converted into space vectors, and a vector model scores the relevance of the text;
109 Extracting data chunks according to the query keyword information. The distributed search engine searches the database and extracts the relevant company data; for example, the search keyword, the company data documents, and the keyword position information together form a data chunk, which optimizes retrieval.
110 Determining the query keyword library. For a specific query keyword, the data chunks associated with it are organized into a search cache, which improves the efficiency of search queries.
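The search cache of step 110 can be pictured as a small memoizing wrapper around the search call, so that a repeated keyword is served from memory. The class name `KeywordCache`, the keyword normalization, and the `misses` counter are assumptions for illustration; the patent does not describe how the cache is organized.

```python
from typing import Callable, Dict, List


class KeywordCache:
    """Caches the data chunks returned for each query keyword."""

    def __init__(self, search: Callable[[str], List[str]]):
        self._search = search          # the underlying (slow) search function
        self._cache: Dict[str, List[str]] = {}
        self.misses = 0                # how many lookups reached the search engine

    def lookup(self, keyword: str) -> List[str]:
        # Normalize so that " Solar " and "solar" share one cache entry.
        key = keyword.strip().lower()
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._search(key)
        return self._cache[key]
```

A production cache would also need an eviction policy and invalidation when the underlying company data is updated; both are omitted here.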
111 Computing similarity over the data chunks. Company information is first modeled from the data chunks, the text data is converted into vectors with the word vector model, and similarity is computed on the resulting vector matrix. The computation proceeds in several layers: the keyword-entity vector model, inverted indexing, keyword optimization, similarity ranking, entity relationship matching, and so on, which together generate the similarity matrix;
112 Returning, according to the similarity matrix, the retrieval results whose similarity exceeds a threshold.
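Steps 111 and 112 can be sketched as building a pairwise similarity matrix over company vectors and then filtering one row by a threshold. Cosine similarity is an assumption here (the patent does not name the similarity function), and the function names and the dictionary-based matrix are hypothetical.

```python
import math


def similarity_matrix(vectors):
    """Pairwise cosine similarity between named company vectors."""

    def cos(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        nu = math.sqrt(sum(x * x for x in u))
        nv = math.sqrt(sum(y * y for y in v))
        return dot / (nu * nv) if nu and nv else 0.0

    names = list(vectors)
    return {a: {b: cos(vectors[a], vectors[b]) for b in names} for a in names}


def similar_companies(matrix, target, threshold):
    """Companies whose similarity to target exceeds threshold, best first."""
    hits = [
        (name, score)
        for name, score in matrix[target].items()
        if name != target and score > threshold
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

For thousands of companies a dense dictionary of dictionaries becomes expensive; a sparse matrix or an approximate nearest-neighbor index would be the usual replacement.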
Fig. 2 describes the flow of parsing and storing data and of integrating and analyzing data in the method of the invention.
201 Obtained listed-company data. The required listed-company operating data is obtained according to 101-105, including industry research reports, company announcements, financial reports, and related important news, as well as prospectuses, annual reports, major announcements, litigation information, patent information, and so on.
202 Type analysis. The company information obtained above is analyzed by type. For document-type data (such as PDF and Word) like industry research reports and listed-company announcements, the valid content, including useful text, pictures, and tables, is extracted according to the structural features of the document. For numeric structured data such as financial reports, the raw data is reprocessed according to its specific structural features, and the original structure is reorganized to produce new structured data that the system of this patent can recognize and process. For web-structured data such as litigation and patent information, the tag header content is analyzed according to the specific page structure, the useful data is extracted, and the data is restructured.
203 Format analysis. The raw listed-company data of the different types from 202 undergoes the corresponding format structuring. For text data such as PDF, the useful information, such as body text and charts, is extracted and formatted into unified structured content. Structured content such as JSON data (company financial data, product information, main business information) is reformatted. For web page data such as company news and litigation information, the format analyzer uniformly extracts the valid data from the page, removes useless format tags, and filters out the useful data.
204 Data storage. The formatted, structured data is stored in a database, and an enterprise information knowledge base with company associations is built to improve data access efficiency.
205 Data deduplication. The formatted data is deduplicated and cleaned again: a hashing technique computes a message digest of each data item, and duplicate data is removed, which improves the utilization efficiency of the listed-company data.
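The digest-based deduplication of step 205 can be sketched with Python's standard `hashlib`. The choice of SHA-256 and the whitespace normalization are assumptions for this example; the patent only states that a hashing technique computes a message digest.

```python
import hashlib


def digest(text: str) -> str:
    """SHA-256 digest of the text after collapsing runs of whitespace."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def deduplicate(records):
    """Keep the first record for each distinct digest, preserving order."""
    seen = set()
    unique = []
    for record in records:
        d = digest(record)
        if d not in seen:
            seen.add(d)
            unique.append(record)
    return unique
```

An exact digest only catches records that are identical after normalization. Near-duplicates, such as the same news item republished with a different footer, would need a fuzzy technique instead.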
206 Information extraction. From the structured data in the listed-company information knowledge base, the content relevant to each need is extracted, such as the enterprise description, product structure, main business composition, financial statements, senior management information, and patent information.
207 Information classification. The content extracted in 206 is classified and associated with the original company information. For each specific enterprise, a corresponding enterprise data profile is generated that describes the company by category from multiple angles.
Fig. 3 describes the flow for computing the company similarity matrix in the method of the invention.
301 Through the flow described in Fig. 2, the raw company data that was obtained undergoes structured extraction, yielding a condensed, classified enterprise data profile;
302 Company information entity extraction. By combining Chinese word segmentation, part-of-speech tagging, recognition and annotation, rule matching, and similar techniques, the company information above is structurally analyzed at the paragraph and sentence level, and the entities and relations in it are extracted.
303 Word vector model. The enterprise entity information obtained in 302 is converted into a text vector matrix with the word vector model. The dimension of each vector is the number of entities in the text; most components of a vector are 0 and certain components are 1. Converting the text into numeric vectors in this way makes the subsequent series of calculations possible;
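The 0/1 vectors described in step 303 correspond to a multi-hot encoding over an entity vocabulary. The sketch below assumes the entities have already been extracted as lists of strings; `build_vocabulary` and `to_vector` are hypothetical names.

```python
def build_vocabulary(entity_lists):
    """Assign each distinct entity a fixed vector dimension, in first-seen order."""
    vocab = {}
    for entities in entity_lists:
        for entity in entities:
            vocab.setdefault(entity, len(vocab))
    return vocab


def to_vector(entities, vocab):
    """0/1 vector with a 1 in every dimension whose entity is present."""
    vector = [0] * len(vocab)
    for entity in entities:
        if entity in vocab:          # entities outside the vocabulary are ignored
            vector[vocab[entity]] = 1
    return vector
```

With a realistic vocabulary these vectors are very sparse, so storing only the indices of the non-zero dimensions is usually preferable to dense lists.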
304 Enterprise entity knowledge base. The extracted enterprise entity information undergoes the series of deeper processing steps 305-308 to construct an enterprise entity association knowledge base with inference capability. This knowledge base provides the inference data chain required by an automatic benchmark-company identification and classification system covering all listed companies on the Shenzhen and Shanghai stock markets and the New Third Board.
305 Inverted index. For the constructed enterprise entity information, an inverted index structure is built in combination with the corresponding search keywords, which improves how well retrieval matches the results. The inverted keyword-entity index can be viewed as an array of linked lists: the head of each list holds a keyword, and the subsequent cells hold all entity vector models that contain that keyword, along with other information such as the keyword's frequency in the entity vector or its position in the entity vector.
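The keyword-entity inverted index of step 305 can be sketched with nested dictionaries in place of linked lists: each keyword maps to its postings, and each posting records the term frequency and positions for one entity. The input shape (entity name mapped to a token list) and the field names `tf` and `positions` are assumptions for this example.

```python
from collections import defaultdict


def build_inverted_index(entity_docs):
    """Map keyword -> {entity: {"tf": count, "positions": [...]}}."""
    index = defaultdict(dict)
    for entity, tokens in entity_docs.items():
        for position, token in enumerate(tokens):
            posting = index[token].setdefault(entity, {"tf": 0, "positions": []})
            posting["tf"] += 1
            posting["positions"].append(position)
    return dict(index)
```

Looking up a keyword then returns every entity that contains it without scanning the entities themselves, which is what makes retrieval by keyword fast.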
306 Keyword optimization. Based on the keyword-entity inverted index model built in 305, the occurrence count and weight of each keyword are optimized for every retrieval result. The more often a keyword occurs in an entity vector, the more important that keyword is considered. If a keyword occurs in many entity vectors, it does less to distinguish entities, so its importance should be reduced accordingly. The higher the dimension of an enterprise's entity vector model, the more often any given keyword is likely to occur and the less each keyword does to distinguish that entity vector, so such keywords should be down-weighted accordingly.
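The three weighting rules in step 306 match the familiar TF-IDF scheme with length normalization, which is used here as an assumed stand-in, since the patent does not give a formula. The weight of keyword t in entity e is taken as (count of t in e / length of e) × ln(N / df(t)), where N is the number of entities and df(t) is the number of entities containing t.

```python
import math
from collections import Counter


def keyword_weights(entity_docs):
    """Length-normalized TF-IDF weight for every keyword of every entity."""
    n = len(entity_docs)
    # Document frequency: in how many entities does each keyword occur?
    df = Counter()
    for tokens in entity_docs.values():
        df.update(set(tokens))
    weights = {}
    for entity, tokens in entity_docs.items():
        counts = Counter(tokens)
        length = len(tokens) or 1  # guard against empty entities
        weights[entity] = {
            term: (count / length) * math.log(n / df[term])
            for term, count in counts.items()
        }
    return weights
```

A keyword that occurs in every entity gets weight 0 under this formula, which reflects the rule that widely shared keywords do little to distinguish one enterprise from another.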
307 Similarity ranking. By continually optimizing and correcting the weights of keywords in the different enterprise entity vector models, the enterprise entity vectors associated with the same keyword are scored and ranked, and a ranking graph of the enterprise entities corresponding to each keyword is built, in order to find the benchmark companies in the same industry and field that are most strongly associated on indicators such as product and business revenue.
308 Entity relationship matching. On the basis of the benchmark-enterprise ranking graph built in 307, the associated enterprise entities are classified and matched according to different search keywords and enterprise-model indicators, such as product structure, main business market, and the region and cycle of the industry. Different relationship-matching processes are carried out, and an enterprise entity relationship knowledge base that supports inference-based retrieval by keyword is built.
309 Inference model. The processing flow 301-308 above builds a complete enterprise entity knowledge base. For different search keywords, the knowledge base can infer on its own the similar-enterprise matching results for a target enterprise and, depending on the category of interest, can infer the benchmark companies for a particular sub-scenario, which is of great help to industry and enterprise analysis.
310 Benchmark matching results. The benchmark matching results are output according to the search keyword.
Fig. 4 is the flow chart of the system implemented according to the method of the invention and describes the overall operating flow of the similar-company retrieval system.
401 Listed-company information acquisition module. The various commonly used company information data are acquired and organized;
402 Data parsing and format analysis module. The crawled data is converted into a unified format; the type and format of the data must be analyzed, different parsing algorithms are used for different data types and formats to convert them into the unified format, and the data is finally stored in a suitable database;
403 Information extraction and structuring module. The data in the unified format undergoes further integration analysis, including data deduplication, information extraction, information classification, and similar algorithms;
404 Keyword optimization and retrieval module. A full-text index over the integrated data is built on a distributed search engine, and the relevant data retrieved for a query keyword forms a data chunk. This involves the vector space model, the BM25 algorithm, and the like; the data is scored for relevance, improving retrieval efficiency;
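Since module 404 names the BM25 algorithm, a compact sketch of Okapi BM25 scoring may help. The parameter defaults `k1=1.5` and `b=0.75` and the non-negative IDF variant ln((N - df + 0.5) / (df + 0.5) + 1) are common conventions assumed here; the patent specifies neither, and a distributed search engine would normally supply its own implementation.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized document for a tokenized query."""
    n = len(docs)
    avgdl = sum(len(tokens) for tokens in docs.values()) / n if n else 0.0
    # Document frequency of every term across the collection.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))
    scores = {}
    for name, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue  # terms absent from the document contribute nothing
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Longer-than-average documents are penalized through b.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[name] = score
    return scores
```

Compared with raw term counts, BM25 saturates the effect of repeated terms and normalizes for document length, which fits the down-weighting rules the description gives for keyword optimization.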
405 Similarity-matrix processing and knowledge-base construction module. This module computes the similarity of company data from the data chunks: the data is first modeled on the basis of the company information, then converted into vector form with the word vector model, and the enterprise entity knowledge base is built through inverted indexing, keyword optimization, similarity ranking, entity relationship matching, and similar processes; the input search keywords are matched by inference.
Claims (9)
1. A method, based on knowledge-base reasoning, for retrieving and classifying enterprises similar to a listed company, comprising the following steps:
1) Obtain company information: collect data on all listed companies, including prospectuses, annual reports, important announcements, financial reports, industry research reports, patent information, litigation information, bidding information, and important company news;
2) Parse and store the data: a parser converts the crawled data into a suitable format and stores it in a database; the parser comprises a type analyzer and a format analyzer, which handle the complex data types and formats and convert them into a unified format;
3) Integrate and analyze the data: the existing data undergoes deduplication, extraction of structured content information, and information classification; for each specific enterprise, an enterprise data profile is built that describes the company by category from the angles of main business composition, equity-participation and holding-company relationships, and financial indicators;
4) Build the enterprise entity knowledge base: using Chinese word segmentation, part-of-speech tagging, recognition and annotation, and rule matching, the company information is structurally analyzed at the paragraph and sentence level, and entities and relations are extracted; the enterprise entity knowledge base is then built with a word vector model and through the steps of inverted indexing, keyword optimization, similarity ranking, and entity relationship matching;
5) Return the benchmark company information related to the target enterprise according to the search keyword.
2. The method for retrieving and classifying similar listed companies based on knowledge base reasoning according to claim 1, wherein parsing and storing data comprises parsing and extracting the acquired listed-company operating data according to its type; the acquired data is submitted uniformly to the type parser, which contains a corresponding data-type interface module for each format type and identifies and parses the corresponding data; the format parser then analyzes the different formats of the data and converts the various company data into a unified format; after parsing, the data is stored in the database.
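A minimal sketch of the two-stage parsing in claim 2, assuming only two source types (JSON and CSV) and a simple dictionary as the unified format. The registry `TYPE_PARSERS`, the function names, and the unified schema are hypothetical; the patent does not specify them, and the database write is omitted here.

```python
import csv
import io
import json


def parse_json(raw):
    return json.loads(raw)


def parse_csv(raw):
    return list(csv.DictReader(io.StringIO(raw)))


# Type parser: one interface per supported data type (hypothetical registry).
TYPE_PARSERS = {"json": parse_json, "csv": parse_csv}


def to_unified(company, source_type, record):
    """Format parser: map a parsed record onto one unified schema (assumed fields)."""
    return {
        "company": company,
        "source_type": source_type,
        "fields": {str(k).strip().lower(): str(v).strip() for k, v in record.items()},
    }


def parse_document(company, source_type, raw):
    """Dispatch on the data type, then normalize every record to the unified format."""
    parser = TYPE_PARSERS.get(source_type)
    if parser is None:
        raise ValueError(f"no parser registered for type {source_type!r}")
    parsed = parser(raw)
    records = parsed if isinstance(parsed, list) else [parsed]
    return [to_unified(company, source_type, r) for r in records]


# Two differently formatted inputs for the same (made-up) company.
rows = parse_document("ExampleCo", "csv", "Year,Revenue\n2016, 120\n")
rows += parse_document("ExampleCo", "json", '{"Year": 2016, "Net Profit": 15}')
```

Both inputs end up as records with the same keys, so later stages do not need to know which source format a record came from. Supporting another type only requires registering one more entry in the registry.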
3. The method for retrieving and classifying similar listed companies based on knowledge base reasoning according to claim 1, wherein integrating and analyzing data comprises further cleaning the data once it is in the unified format; the data is first deduplicated: for the large volume of descriptive data, financial data, and news data that a company contains, the existing data is cleaned and checked after the first layer of format parsing, and duplicate data is removed; the deduplicated data still contains a large amount of redundant data such as useless tags and tables, so rule-based recognition techniques are applied to the cleaned data to extract the useful data; finally, according to each company's circumstances, the data is classified into categories that mainly include financial model, peer-company comparison, product structure, sales model, and customers and markets.
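The cleaning order in claim 3 (deduplicate, strip leftover markup, then classify by rule) might look like the following sketch. The hash-based duplicate check, the regular expressions, and the keyword rules in `CATEGORY_RULES` are illustrative assumptions; the claim only says that rule-based techniques are used.

```python
import hashlib
import re

TAG_RE = re.compile(r"<[^>]+>")
WS_RE = re.compile(r"\s+")

# Hypothetical keyword rules for some of the categories named in the claim.
CATEGORY_RULES = {
    "financial model": ("revenue", "profit", "margin"),
    "product structure": ("product", "portfolio"),
    "sales model": ("distributor", "direct sales", "channel"),
    "customers and markets": ("customer", "market share"),
}


def deduplicate(texts):
    """Keep the first occurrence of each text, ignoring case and whitespace differences."""
    seen, unique = set(), []
    for t in texts:
        normalized = WS_RE.sub(" ", t).strip().lower()
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(t)
    return unique


def strip_markup(text):
    """Remove leftover tags and collapse whitespace."""
    return WS_RE.sub(" ", TAG_RE.sub(" ", text)).strip()


def classify(text):
    """Return every category whose keywords occur in the text."""
    lowered = text.lower()
    matched = [c for c, kws in CATEGORY_RULES.items() if any(k in lowered for k in kws)]
    return matched or ["uncategorized"]


raw = [
    "<p>Revenue grew 12% and net profit margin improved.</p>",
    "<p>Revenue grew 12%  and net profit margin improved.</p>",
    "<div>Products are sold through distributors.</div>",
]
records = [(strip_markup(t), classify(strip_markup(t))) for t in deduplicate(raw)]
```

The second input differs from the first only by an extra space, so it is dropped as a duplicate before markup is stripped. Keyword rules of this kind are crude; they stand in for whatever rule set the actual system uses.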
4. The method for retrieving and classifying similar listed companies based on knowledge base reasoning according to claim 1, wherein building the enterprise entity knowledge base comprises building a full-text index over the data: a distributed search engine is used to build a full-text index over the structured data, full-text term retrieval is performed on the related documents, the text data is converted into space vectors, and a vector model is used to score the relevance of the text.
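Claim 4 combines a full-text (inverted) index with vector-model relevance scoring. The sketch below shows both on a single machine, assuming pre-tokenized text, TF-IDF weights, and cosine similarity; the patent uses a distributed search engine and does not name the weighting scheme, so these choices and the sample documents are assumptions.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Inverted index: term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        for term, tf in Counter(tokens).items():
            index[term][doc_id] = tf
    return index


def tfidf_vector(tokens, index, n_docs):
    """Sparse TF-IDF vector; terms missing from the index are skipped."""
    vec = {}
    for term, tf in Counter(tokens).items():
        df = len(index.get(term, {}))
        if df:
            vec[term] = tf * (math.log(n_docs / df) + 1.0)
    return vec


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def search(query_tokens, docs, index):
    """Return doc ids sharing a term with the query, best cosine score first."""
    n_docs = len(docs)
    q = tfidf_vector(query_tokens, index, n_docs)
    candidates = {d for t in q for d in index[t]}
    scored = [(cosine(q, tfidf_vector(docs[d], index, n_docs)), d) for d in candidates]
    return [d for _, d in sorted(scored, key=lambda x: (-x[0], x[1]))]


# Hypothetical tokenized company descriptions.
docs = {
    "A": ["solar", "panel", "maker"],
    "B": ["solar", "inverter", "and", "solar", "storage"],
    "C": ["retail", "bank"],
}
index = build_index(docs)
hits = search(["solar", "inverter"], docs, index)
```

The inverted index limits scoring to documents that share at least one term with the query, which is what keeps full-text retrieval cheap as the collection grows.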
5. The method for retrieving and classifying similar listed companies based on knowledge base reasoning according to claim 1, wherein building the enterprise entity knowledge base comprises extracting data chunks according to the query keyword: the database is searched with a distributed search engine, the relevant company data is extracted to form a data chunk, and retrieval is optimized.
6. The method for retrieving and classifying similar listed companies based on knowledge base reasoning according to claim 1, wherein building the enterprise entity knowledge base comprises a query keyword library: for a specific query keyword, the data chunks associated with it are organized into a search cache space, improving the efficiency of search queries.
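The search cache of claim 6 can be sketched as a small least-recently-used cache keyed by the normalized query keyword. The class name, the eviction policy, and the `fake_search` stand-in for the real search engine call are assumptions; the claim only states that data chunks associated with a keyword are organized into a cache.

```python
from collections import OrderedDict


class KeywordChunkCache:
    """Small LRU cache mapping a normalized query keyword to its data chunk."""

    def __init__(self, fetch, capacity=128):
        self._fetch = fetch          # callable that runs the real (expensive) search
        self._capacity = capacity
        self._store = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, keyword):
        key = keyword.strip().lower()
        if key in self._store:
            self.hits += 1
            self._store.move_to_end(key)   # mark as most recently used
            return self._store[key]
        self.misses += 1
        chunk = self._fetch(key)
        self._store[key] = chunk
        if len(self._store) > self._capacity:
            self._store.popitem(last=False)  # evict the least recently used entry
        return chunk


calls = []


def fake_search(keyword):
    """Stand-in for the search engine; records how often it is really called."""
    calls.append(keyword)
    return [f"doc-about-{keyword}"]


cache = KeywordChunkCache(fake_search, capacity=2)
for kw in ["Battery", "battery ", "bank", "solar"]:
    cache.get(kw)
```

The second lookup differs from the first only in case and trailing whitespace, so it is served from the cache without calling the search function again.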
7. The method for retrieving and classifying similar listed companies based on knowledge base reasoning according to claim 1, wherein building the enterprise entity knowledge base comprises computing similarity over the data chunks: company information modeling is first applied to the data chunks, the text data is converted into vectors using a word vector model, and similarity is computed on the resulting vector matrix; the computation proceeds in multiple stages: company information is vectorized with a keyword-entity vector model, an inverted index is built over the company information, the search keyword is optimized, company similarity is optimized using similarity computation techniques, entity relation matching is completed, and a similarity matrix is generated.
8. The method for retrieving and classifying similar listed companies based on knowledge base reasoning according to claim 1, wherein building the enterprise entity knowledge base comprises returning, according to the similarity matrix, the retrieval results whose similarity exceeds a threshold.
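Claims 7 and 8 together describe generating a similarity matrix from company vectors and returning the results above a threshold. A minimal sketch follows, assuming cosine similarity as the measure (the claims do not name one), a threshold of 0.8, and made-up three-dimensional vectors in place of real word-vector output.

```python
import math


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_matrix(vectors):
    """Pairwise cosine similarity between all company vectors."""
    names = list(vectors)
    matrix = [[cosine(vectors[a], vectors[b]) for b in names] for a in names]
    return names, matrix


def similar_companies(target, vectors, threshold=0.8):
    """Companies whose similarity to `target` exceeds the threshold, best first."""
    names, matrix = similarity_matrix(vectors)
    row = matrix[names.index(target)]
    hits = [(n, s) for n, s in zip(names, row) if n != target and s > threshold]
    return sorted(hits, key=lambda x: x[1], reverse=True)


# Hypothetical company vectors; real ones would come from the word vector model.
vectors = {
    "BatteryCo": [0.9, 0.1, 0.0],
    "CellWorks": [0.8, 0.2, 0.1],
    "RetailBank": [0.0, 0.1, 0.9],
}
names, matrix = similarity_matrix(vectors)
result = similar_companies("BatteryCo", vectors)
```

With these vectors only the second battery company clears the threshold; the bank is nearly orthogonal to the target and is filtered out. The threshold trades recall for precision and would need tuning on real data.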
9. A system for retrieving and classifying similar listed companies based on knowledge base reasoning, characterized in that it comprises:
a company information acquisition module, which acquires and organizes the various commonly used company information data;
a data parsing and format analysis module, which parses the crawled data into a unified format; the data types and formats are analyzed, a different parsing algorithm is applied to each data type and format to produce the unified format, and the data is finally stored in a suitable database;
an information extraction and structuring module, which further integrates and analyzes the data in unified format, including data deduplication, information extraction, and information classification algorithms;
a keyword-optimized retrieval module, which builds a full-text index over the integrated data using a distributed search engine, forms data chunks from the relevant data retrieved for a query keyword, and scores the relevance of the data, improving retrieval efficiency;
a similarity-matrix processing and knowledge base construction module, which computes the similarity of company data from the data chunks: the data is first modeled on the basis of company information and then converted into vector form using a word vector model; through inverted indexing, keyword optimization, similarity ranking, and entity relation matching, the enterprise entity knowledge base is built, and the input search keyword is matched against it by inference.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710259506.8A CN107066599B (en) | 2017-04-20 | 2017-04-20 | Similar listed company enterprise retrieval classification method and system based on knowledge base reasoning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710259506.8A CN107066599B (en) | 2017-04-20 | 2017-04-20 | Similar listed company enterprise retrieval classification method and system based on knowledge base reasoning |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107066599A (en) | 2017-08-18 |
CN107066599B (en) | 2021-11-30 |
Family
ID=59599954
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710259506.8A Active CN107066599B (en) | 2017-04-20 | 2017-04-20 | Similar listed company enterprise retrieval classification method and system based on knowledge base reasoning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107066599B (en) |
Cited By (44)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107818130A (en) * | 2017-09-15 | 2018-03-20 | 深圳市电陶思创科技有限公司 | The method for building up and system of a kind of search engine |
CN107844960A (en) * | 2017-11-22 | 2018-03-27 | 辅投帮(武汉)科技有限公司 | A kind of investment analysis tools of automatic intelligent analysis report of business plan |
CN108073692A (en) * | 2017-12-06 | 2018-05-25 | 国云科技股份有限公司 | A kind of enterprise's ranking system and its implementation |
CN108563783A (en) * | 2018-04-25 | 2018-09-21 | 张艳 | A kind of financial analysis management system and method based on big data |
CN108596439A (en) * | 2018-03-29 | 2018-09-28 | 北京中兴通网络科技股份有限公司 | A kind of the business risk prediction technique and system of knowledge based collection of illustrative plates |
CN108615124A (en) * | 2018-05-11 | 2018-10-02 | 北京窝头网络科技有限公司 | Valuation of enterprise method and system based on word frequency analysis |
CN109145081A (en) * | 2018-07-27 | 2019-01-04 | 安康市惠企财税服务有限公司 | A kind of financial data search method and system |
CN109165337A (en) * | 2018-10-17 | 2019-01-08 | 珠海市智图数研信息技术有限公司 | A kind of method and system of knowledge based map construction bidding field association analysis |
CN109213867A (en) * | 2018-10-26 | 2019-01-15 | 湖北大学 | A kind of mass knowledge base construction method precisely predicted towards big data |
CN109241046A (en) * | 2018-08-30 | 2019-01-18 | 天津做票君机器人科技有限公司 | A kind of inventory information recognition methods of negotiation by draft robot and identifier |
CN109359817A (en) * | 2018-09-13 | 2019-02-19 | 江苏站企动网络科技有限公司 | A kind of business information analysis management system |
CN109376273A (en) * | 2018-09-21 | 2019-02-22 | 平安科技(深圳)有限公司 | Company information map construction method, apparatus, computer equipment and storage medium |
CN109558492A (en) * | 2018-10-16 | 2019-04-02 | 中山大学 | A kind of listed company's knowledge mapping construction method and device suitable for event attribution |
CN109598705A (en) * | 2018-11-19 | 2019-04-09 | 江苏科技大学 | A kind of inspection procedure automatic generation method based on detection feature |
CN109657066A (en) * | 2018-11-19 | 2019-04-19 | 平安科技(深圳)有限公司 | Knowledge mapping construction method, device and computer equipment based on multi-angle of view |
CN109785144A (en) * | 2019-01-18 | 2019-05-21 | 国家电网有限公司 | A kind of assets classes method, apparatus, equipment and medium |
CN110020660A (en) * | 2017-12-06 | 2019-07-16 | 埃森哲环球解决方案有限公司 | Use the integrity assessment of the unstructured process of artificial intelligence (AI) technology |
CN110110044A (en) * | 2019-04-11 | 2019-08-09 | 广州探迹科技有限公司 | A kind of method of company information combined sorting |
CN110162590A (en) * | 2019-02-22 | 2019-08-23 | 北京捷风数据技术有限公司 | A kind of database displaying method and device thereof of calling for tenders of project text combination economic factor |
CN110427547A (en) * | 2018-04-26 | 2019-11-08 | 观相科技(上海)有限公司 | A kind of search system and searching method based on industrial characteristic |
CN110532383A (en) * | 2019-07-18 | 2019-12-03 | 中山大学 | A kind of patent text classification method based on intensified learning |
CN110737749A (en) * | 2019-10-11 | 2020-01-31 | 软通动力信息技术有限公司 | Entrepreneurship plan evaluation method, entrepreneurship plan evaluation device, computer equipment and storage medium |
CN110795425A (en) * | 2019-10-31 | 2020-02-14 | 上海义缘网络科技有限公司 | Method, device, equipment and medium for cleaning and merging customs data |
CN110879829A (en) * | 2019-11-26 | 2020-03-13 | 杭州皓智天诚信息科技有限公司 | Intellectual property big data service intelligent system |
CN111008265A (en) * | 2019-12-03 | 2020-04-14 | 腾讯云计算(北京)有限责任公司 | Enterprise information searching method and device |
CN111080132A (en) * | 2019-12-18 | 2020-04-28 | 北京智识企业管理咨询有限公司 | Industry chain analysis system and method based on big data |
CN111125185A (en) * | 2019-11-25 | 2020-05-08 | 泰康保险集团股份有限公司 | Data processing method, device, medium and electronic equipment |
CN111177189A (en) * | 2019-12-20 | 2020-05-19 | 航天云网科技发展有限责任公司 | Client optimization system and method based on user behavior analysis |
CN111176650A (en) * | 2018-11-09 | 2020-05-19 | 阿里巴巴集团控股有限公司 | Parser generation method, search method, server, and storage medium |
CN111183421A (en) * | 2017-10-06 | 2020-05-19 | 株式会社东芝 | Service providing system, business analysis support system, method, and program |
CN111737421A (en) * | 2020-08-07 | 2020-10-02 | 杭州六棱镜知识产权科技有限公司 | Intellectual property big data information retrieval system and storage medium |
CN112115314A (en) * | 2020-09-16 | 2020-12-22 | 江苏开拓信息与系统有限公司 | General government affair big data aggregation retrieval system and construction method |
CN112183090A (en) * | 2020-10-09 | 2021-01-05 | 浪潮云信息技术股份公司 | Method for calculating entity relevance based on word network |
CN112182223A (en) * | 2020-10-12 | 2021-01-05 | 浙江工业大学 | Enterprise industry classification method and system based on domain ontology |
CN112214572A (en) * | 2020-10-20 | 2021-01-12 | 济南浪潮高新科技投资发展有限公司 | Method for secondarily extracting entities in resume analysis |
CN112434158A (en) * | 2020-11-13 | 2021-03-02 | 北京创业光荣信息科技有限责任公司 | Enterprise label acquisition method and device, storage medium and computer equipment |
CN112434665A (en) * | 2020-12-12 | 2021-03-02 | 广东电力信息科技有限公司 | Method and device for intelligently identifying financial data in image based on machine learning |
CN112507201A (en) * | 2020-11-03 | 2021-03-16 | 国网浙江省电力有限公司台州供电公司 | Search engine construction and search method based on NLP (non-line segment) retrieval analysis technology |
CN112612937A (en) * | 2020-12-07 | 2021-04-06 | 深圳价值在线信息科技股份有限公司 | Associated information acquisition method and equipment |
CN112650951A (en) * | 2020-12-21 | 2021-04-13 | 撼地数智(重庆)科技有限公司 | Enterprise similarity matching method, system and computing device |
CN112734493A (en) * | 2021-01-18 | 2021-04-30 | 科技谷(厦门)信息技术有限公司 | Industry monitoring analysis platform |
CN113742496A (en) * | 2021-09-10 | 2021-12-03 | 国网江苏省电力有限公司电力科学研究院 | Power knowledge learning system and method based on heterogeneous resource fusion |
CN116578677A (en) * | 2023-07-14 | 2023-08-11 | 高密市中医院 | Retrieval system and method for medical examination information |
CN117057942A (en) * | 2023-10-12 | 2023-11-14 | 之江实验室科技控股有限公司 | Intelligent financial decision big data analysis system |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102096845A (en) * | 2009-12-10 | 2011-06-15 | 黑龙江省森林工程与环境研究所 | Knowledge base full text search engine system for classified forest management |
CN104008107A (en) * | 2013-02-25 | 2014-08-27 | 成都勤智数码科技股份有限公司 | Implement method of knowledge base on operation and maintenance management |
CN104834668A (en) * | 2015-03-13 | 2015-08-12 | 浙江奇道网络科技有限公司 | Position recommendation system based on knowledge base |
CN104834736A (en) * | 2015-05-19 | 2015-08-12 | 深圳证券信息有限公司 | Method and device for establishing index database and retrieval method, device and system |
CN106126695A (en) * | 2016-06-30 | 2016-11-16 | 张春生 | A kind of similar case search method and device |
CN106156104A (en) * | 2015-04-02 | 2016-11-23 | 北京奇虎科技有限公司 | Crawl the method and device of corporate intranet information |
CN106296312A (en) * | 2016-08-30 | 2017-01-04 | 江苏名通信息科技有限公司 | Online education resource recommendation system based on social media |
Non-Patent Citations (3)
Title |
---|
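赵民等 (Zhao Min et al.): "《基于流程的知识工程与创新》" [Process-Based Knowledge Engineering and Innovation], 31 January 2016 * |
鲍捷 (Bao Jie): "智能金融的核心引擎_一览与前瞻" [The Core Engine of Intelligent Finance: Overview and Outlook], 《软件和集成技术》 [Software and Integration Technology] * |
鲍捷 (Bao Jie): "知识图谱如何助力实现智能金融" [How Knowledge Graphs Help Realize Intelligent Finance], 《金卡工程》 [Golden Card Project] * |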
Cited By (62)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107818130A (en) * | 2017-09-15 | 2018-03-20 | 深圳市电陶思创科技有限公司 | The method for building up and system of a kind of search engine |
CN111183421B (en) * | 2017-10-06 | 2023-11-28 | 株式会社东芝 | Service providing system, business analysis supporting system, method and recording medium |
CN111183421A (en) * | 2017-10-06 | 2020-05-19 | 株式会社东芝 | Service providing system, business analysis support system, method, and program |
CN107844960A (en) * | 2017-11-22 | 2018-03-27 | 辅投帮(武汉)科技有限公司 | A kind of investment analysis tools of automatic intelligent analysis report of business plan |
CN107844960B (en) * | 2017-11-22 | 2020-12-01 | 辅投帮(武汉)科技有限公司 | Investment analysis tool for automatically and intelligently analyzing business plan |
CN108073692B (en) * | 2017-12-06 | 2021-09-21 | 国云科技股份有限公司 | Method for implementing enterprise ranking system |
CN110020660B (en) * | 2017-12-06 | 2023-05-09 | 埃森哲环球解决方案有限公司 | Integrity assessment of unstructured processes using Artificial Intelligence (AI) techniques |
US11574204B2 (en) | 2017-12-06 | 2023-02-07 | Accenture Global Solutions Limited | Integrity evaluation of unstructured processes using artificial intelligence (AI) techniques |
CN110020660A (en) * | 2017-12-06 | 2019-07-16 | 埃森哲环球解决方案有限公司 | Use the integrity assessment of the unstructured process of artificial intelligence (AI) technology |
CN108073692A (en) * | 2017-12-06 | 2018-05-25 | 国云科技股份有限公司 | A kind of enterprise's ranking system and its implementation |
CN108596439A (en) * | 2018-03-29 | 2018-09-28 | 北京中兴通网络科技股份有限公司 | A kind of the business risk prediction technique and system of knowledge based collection of illustrative plates |
CN108563783A (en) * | 2018-04-25 | 2018-09-21 | 张艳 | A kind of financial analysis management system and method based on big data |
CN110427547A (en) * | 2018-04-26 | 2019-11-08 | 观相科技(上海)有限公司 | A kind of search system and searching method based on industrial characteristic |
CN108615124A (en) * | 2018-05-11 | 2018-10-02 | 北京窝头网络科技有限公司 | Valuation of enterprise method and system based on word frequency analysis |
CN108615124B (en) * | 2018-05-11 | 2022-02-01 | 北京窝头网络科技有限公司 | Enterprise evaluation method and system based on word frequency analysis |
CN109145081A (en) * | 2018-07-27 | 2019-01-04 | 安康市惠企财税服务有限公司 | A kind of financial data search method and system |
CN109241046A (en) * | 2018-08-30 | 2019-01-18 | 天津做票君机器人科技有限公司 | A kind of inventory information recognition methods of negotiation by draft robot and identifier |
CN109359817A (en) * | 2018-09-13 | 2019-02-19 | 江苏站企动网络科技有限公司 | A kind of business information analysis management system |
CN109376273B (en) * | 2018-09-21 | 2024-02-27 | 平安科技(深圳)有限公司 | Enterprise information map construction method, enterprise information map construction device, computer equipment and storage medium |
CN109376273A (en) * | 2018-09-21 | 2019-02-22 | 平安科技(深圳)有限公司 | Company information map construction method, apparatus, computer equipment and storage medium |
CN109558492A (en) * | 2018-10-16 | 2019-04-02 | 中山大学 | A kind of listed company's knowledge mapping construction method and device suitable for event attribution |
CN109165337B (en) * | 2018-10-17 | 2021-10-15 | 珠海市智图数研信息技术有限公司 | Method and system for establishing bid and ask field association analysis based on knowledge graph |
CN109165337A (en) * | 2018-10-17 | 2019-01-08 | 珠海市智图数研信息技术有限公司 | A kind of method and system of knowledge based map construction bidding field association analysis |
CN109213867A (en) * | 2018-10-26 | 2019-01-15 | 湖北大学 | A kind of mass knowledge base construction method precisely predicted towards big data |
CN111176650A (en) * | 2018-11-09 | 2020-05-19 | 阿里巴巴集团控股有限公司 | Parser generation method, search method, server, and storage medium |
CN111176650B (en) * | 2018-11-09 | 2023-04-18 | 阿里巴巴集团控股有限公司 | Parser generation method, search method, server, and storage medium |
CN109598705A (en) * | 2018-11-19 | 2019-04-09 | 江苏科技大学 | A kind of inspection procedure automatic generation method based on detection feature |
CN109657066A (en) * | 2018-11-19 | 2019-04-19 | 平安科技(深圳)有限公司 | Knowledge mapping construction method, device and computer equipment based on multi-angle of view |
CN109598705B (en) * | 2018-11-19 | 2023-06-23 | 江苏科技大学 | Automatic generation method of inspection procedure based on detection characteristics |
CN109785144A (en) * | 2019-01-18 | 2019-05-21 | 国家电网有限公司 | A kind of assets classes method, apparatus, equipment and medium |
CN110162590A (en) * | 2019-02-22 | 2019-08-23 | 北京捷风数据技术有限公司 | A kind of database displaying method and device thereof of calling for tenders of project text combination economic factor |
CN110110044B (en) * | 2019-04-11 | 2020-05-05 | 广州探迹科技有限公司 | Method for enterprise information combination screening |
CN110110044A (en) * | 2019-04-11 | 2019-08-09 | 广州探迹科技有限公司 | A kind of method of company information combined sorting |
CN110532383A (en) * | 2019-07-18 | 2019-12-03 | 中山大学 | A kind of patent text classification method based on intensified learning |
CN110737749A (en) * | 2019-10-11 | 2020-01-31 | 软通动力信息技术有限公司 | Entrepreneurship plan evaluation method, entrepreneurship plan evaluation device, computer equipment and storage medium |
CN110737749B (en) * | 2019-10-11 | 2022-09-27 | 软通智慧信息技术有限公司 | Entrepreneurship plan evaluation method, entrepreneurship plan evaluation device, computer equipment and storage medium |
CN110795425B (en) * | 2019-10-31 | 2023-04-28 | 上海义缘网络科技有限公司 | Customs data cleaning and merging method, device, equipment and medium |
CN110795425A (en) * | 2019-10-31 | 2020-02-14 | 上海义缘网络科技有限公司 | Method, device, equipment and medium for cleaning and merging customs data |
CN111125185A (en) * | 2019-11-25 | 2020-05-08 | 泰康保险集团股份有限公司 | Data processing method, device, medium and electronic equipment |
CN110879829A (en) * | 2019-11-26 | 2020-03-13 | 杭州皓智天诚信息科技有限公司 | Intellectual property big data service intelligent system |
CN111008265A (en) * | 2019-12-03 | 2020-04-14 | 腾讯云计算(北京)有限责任公司 | Enterprise information searching method and device |
CN111008265B (en) * | 2019-12-03 | 2023-03-28 | 腾讯云计算(北京)有限责任公司 | Enterprise information searching method and device |
CN111080132A (en) * | 2019-12-18 | 2020-04-28 | 北京智识企业管理咨询有限公司 | Industry chain analysis system and method based on big data |
CN111177189B (en) * | 2019-12-20 | 2024-04-05 | 北京航天云路有限公司 | Client optimization system and method based on user behavior analysis |
CN111177189A (en) * | 2019-12-20 | 2020-05-19 | 航天云网科技发展有限责任公司 | Client optimization system and method based on user behavior analysis |
CN111737421A (en) * | 2020-08-07 | 2020-10-02 | 杭州六棱镜知识产权科技有限公司 | Intellectual property big data information retrieval system and storage medium |
CN112115314A (en) * | 2020-09-16 | 2020-12-22 | 江苏开拓信息与系统有限公司 | General government affair big data aggregation retrieval system and construction method |
CN112183090A (en) * | 2020-10-09 | 2021-01-05 | 浪潮云信息技术股份公司 | Method for calculating entity relevance based on word network |
CN112182223A (en) * | 2020-10-12 | 2021-01-05 | 浙江工业大学 | Enterprise industry classification method and system based on domain ontology |
CN112214572A (en) * | 2020-10-20 | 2021-01-12 | 济南浪潮高新科技投资发展有限公司 | Method for secondarily extracting entities in resume analysis |
CN112214572B (en) * | 2020-10-20 | 2022-11-01 | 山东浪潮科学研究院有限公司 | Method for secondarily extracting entities in resume analysis |
CN112507201A (en) * | 2020-11-03 | 2021-03-16 | 国网浙江省电力有限公司台州供电公司 | Search engine construction and search method based on NLP (non-line segment) retrieval analysis technology |
CN112434158A (en) * | 2020-11-13 | 2021-03-02 | 北京创业光荣信息科技有限责任公司 | Enterprise label acquisition method and device, storage medium and computer equipment |
CN112612937A (en) * | 2020-12-07 | 2021-04-06 | 深圳价值在线信息科技股份有限公司 | Associated information acquisition method and equipment |
CN112434665A (en) * | 2020-12-12 | 2021-03-02 | 广东电力信息科技有限公司 | Method and device for intelligently identifying financial data in image based on machine learning |
CN112650951A (en) * | 2020-12-21 | 2021-04-13 | 撼地数智(重庆)科技有限公司 | Enterprise similarity matching method, system and computing device |
CN112734493A (en) * | 2021-01-18 | 2021-04-30 | 科技谷(厦门)信息技术有限公司 | Industry monitoring analysis platform |
CN113742496A (en) * | 2021-09-10 | 2021-12-03 | 国网江苏省电力有限公司电力科学研究院 | Power knowledge learning system and method based on heterogeneous resource fusion |
CN116578677A (en) * | 2023-07-14 | 2023-08-11 | 高密市中医院 | Retrieval system and method for medical examination information |
CN116578677B (en) * | 2023-07-14 | 2023-09-15 | 高密市中医院 | Retrieval system and method for medical examination information |
CN117057942A (en) * | 2023-10-12 | 2023-11-14 | 之江实验室科技控股有限公司 | Intelligent financial decision big data analysis system |
CN117057942B (en) * | 2023-10-12 | 2024-01-30 | 之江实验室科技控股有限公司 | Intelligent financial decision big data analysis system |
Also Published As
Publication number | Publication date |
---|---|
CN107066599B (en) | 2021-11-30 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN107066599A (en) | A kind of similar enterprise of the listed company searching classification method and system of knowledge based storehouse reasoning | |
Deng et al. | A study of supervised term weighting scheme for sentiment analysis | |
US11663254B2 (en) | System and engine for seeded clustering of news events | |
Noh et al. | Keyword selection and processing strategy for applying text mining to patent analysis | |
Xie et al. | A novel text mining approach for scholar information extraction from web content in Chinese | |
CN102609512A (en) | System and method for heterogeneous information mining and visual analysis | |
Ahmadov et al. | Towards a hybrid imputation approach using web tables | |
Loudcher et al. | Combining OLAP and information networks for bibliographic data analysis: a survey | |
DE102012221251A1 (en) | Semantic and contextual search of knowledge stores | |
CN106484813A (en) | A kind of big data analysis system and method | |
CN111737421A (en) | Intellectual property big data information retrieval system and storage medium | |
CN114880486A (en) | Industry chain identification method and system based on NLP and knowledge graph | |
CN114254201A (en) | Recommendation method for science and technology project review experts | |
De et al. | An introduction to data mining in social networks | |
CA2956627A1 (en) | System and engine for seeded clustering of news events | |
Bhardwaj et al. | Review of text mining techniques | |
Dehghan et al. | Mining shape of expertise: A novel approach based on convolutional neural network | |
Chen et al. | Exploring technology opportunities and evolution of IoT-related logistics services with text mining | |
CN114896423A (en) | Construction method and system of enterprise basic information knowledge graph | |
CN106909626A (en) | Improved Decision Tree Algorithm realizes search engine optimization technology | |
Liao et al. | Improving farm management optimization: Application of text data analysis and semantic networks | |
CN115953041A (en) | Construction scheme and system of operator policy system | |
Panagopoulos et al. | Scientometrics for success and influence in the microsoft academic graph | |
Nogales et al. | Measuring vocabulary use in the Linked Data Cloud | |
CN113127650A (en) | Technical map construction method and system based on map database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |