CN110297829A - Text retrieval method and system for industry-specific structured business data

Info

Publication number
CN110297829A
Authority
CN
China
Prior art keywords
data
text
business datum
database
specific industry
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910558557.XA
Other languages
Chinese (zh)
Inventor
涂腾飞
张进
余伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Unisinsight Technology Co Ltd
Original Assignee
Chongqing Unisinsight Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Unisinsight Technology Co Ltd
Priority to CN201910558557.XA
Publication of CN110297829A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The invention discloses a text retrieval method for industry-specific structured business data. The method comprises: extracting data from a business database into a temporary table; converting the extracted data; building an inverted index on the converted data; and performing full-text search on the input text. The invention addresses the inaccurate word segmentation that full-text search suffers on fields such as special numbers and specialized vocabulary in business databases, so that full-text search can also return good results on structured business data. Because it relies on an inverted index, retrieval is significantly faster than querying the structured data with SQL in a relational database.

Description

Text retrieval method and system for industry-specific structured business data
Technical field
The invention belongs to the field of information retrieval, and in particular relates to a text retrieval method and system for structured business data of specific industries (such as public security, traffic, and education).
Background
With the rapid development of technologies such as big data and artificial intelligence and the falling cost of storage, people pay increasing attention to collecting and using data to boost production and open up new lines of business. Faced with huge volumes of data, how to retrieve that data is a vital problem. There are currently two main retrieval approaches: exact queries over structured data using SQL statements, and full-text search based on an inverted index built after word segmentation of text. In the full-text search field, the most representative tools are Elasticsearch and Solr.
For exact SQL queries, developers must configure SQL statements and business logic for each field, and users must enter data precisely into the corresponding search fields and filter step by step before the desired results appear. This mode of operation is unfriendly to users. Full-text search alleviates the problem: a word segmenter splits the data, the resulting words are stored in an index by field, the user's input text is segmented as well, relevance scores against the fields are computed, and results are presented in order of score, which greatly improves query efficiency. Full-text search has drawbacks too. Retrieval accuracy depends heavily on the quality of word segmentation, and when searching industry-specific data many fields cannot be segmented properly, for example number-like data such as license plates, device IDs, and phone numbers, and proprietary vocabulary such as the names of regulations and statutes and industry jargon. In that case, building a full-text search engine over business data requires not only configuring a special dictionary for each table but also continually updating the dictionaries as the data changes, which makes the system inconvenient to build and maintain.
Summary of the invention
In view of the above deficiencies of the prior art, the invention provides a text retrieval method and system for industry-specific structured business data. It achieves one-click full-text search over structured business data without requiring updates to the word segmentation dictionary, and can effectively improve data query efficiency.
To achieve the above and other related objects, the invention provides a text retrieval method for industry-specific structured business data, the method comprising:
Extracting data from a business database into a temporary table;
Converting the data extracted from the business database;
Building an inverted index on the converted data;
Performing full-text search on the input text.
Optionally, if the extracted data is OLAP data, the data is extracted from the business database incrementally; if the extracted data is OLTP data, the data is extracted from the OLTP database in full.
Optionally, if the extracted data is OLAP data, the text retrieval method further comprises a deduplication step, specifically: deleting the duplicate data in the temporary table.
Optionally, the text retrieval method further comprises:
Synchronizing the stored data into the business database.
Optionally, building the inverted index on the converted data specifically comprises:
Concatenating the fields to be searched into one large search field;
Splitting the search field into words using a word segmenter and a designated separator;
Building the corresponding indexes.
To achieve the above and other related objects, the invention also provides a text retrieval system for industry-specific structured business data, the system comprising:
A data extraction module, for extracting data from a database into a temporary table;
A data conversion module, for converting the data extracted from the database;
An index building module, for building an inverted index on the converted data;
A retrieval module, for performing full-text search on the input text.
Optionally, if the data extracted by the data extraction module is OLAP data, the data is extracted from the OLAP database incrementally; if the data extracted by the data extraction module is OLTP data, the data is extracted from the OLTP database in full.
Optionally, if the data extracted by the data extraction module is OLAP data, the text retrieval system further comprises a deduplication module, for deleting the duplicate data in the temporary table.
Optionally, the text retrieval system further comprises a synchronization module, for synchronizing the stored data into the OLAP database.
Optionally, the index building module comprises:
A concatenation submodule, for concatenating the fields to be searched into one large search field;
A word-splitting submodule, for splitting the search field into words using a word segmenter and a designated separator;
An index building submodule, for building the corresponding indexes.
As described above, the text retrieval method and system for industry-specific structured business data of the invention have the following advantages:
1. Simplified user operation. Compared with querying specific fields through SQL, the invention simplifies user operation and lowers the skill required of users: a user only needs to enter content in a single search box to obtain query results.
2. Improved retrieval accuracy and efficiency. The invention addresses the inaccurate word segmentation that full-text search suffers on fields such as special numbers and specialized vocabulary in business databases, so that full-text search can also return good results on structured business data. Because it relies on an inverted index, retrieval is significantly faster than querying the structured data with SQL in a relational database.
Brief description of the drawings
Fig. 1 is a flowchart of a text retrieval method for industry-specific structured business data in one embodiment of the invention;
Fig. 2 is the system architecture for building an index on OLTP data in one embodiment of the invention;
Fig. 3 is the system architecture for building an index on OLAP data in one embodiment of the invention;
Fig. 4 is a flowchart of data deduplication in one embodiment of the invention;
Fig. 5 is a flowchart of data conversion in one embodiment of the invention;
Fig. 6 is a flowchart of index building in GPtext in one embodiment of the invention;
Fig. 7 is a schematic diagram of retrieval in GPtext in one embodiment of the invention;
Fig. 8 is a schematic diagram of the data structure used to build the full-text inverted index under the greenplum+GPtext architecture in one embodiment of the invention;
Fig. 9 is an example of a GPtext index mapping in one embodiment of the invention;
Fig. 10 shows the result of full-text search over business data in one embodiment of the invention;
Fig. 11 is a block diagram of a text retrieval system for industry-specific structured business data in one embodiment of the invention;
Fig. 12 is a block diagram of the index building module in one embodiment of the invention.
Detailed description of the embodiments
The embodiments of the invention are described below through specific examples; those skilled in the art can readily understand other advantages and effects of the invention from the content disclosed in this specification. The invention may also be implemented or applied through other, different embodiments, and the details in this specification may be modified or changed based on different viewpoints and applications without departing from the spirit of the invention. It should be noted that, where no conflict arises, the following embodiments and the features in them may be combined with one another.
It should be noted that the drawings provided with the following embodiments illustrate the basic concept of the invention only schematically. They show only the components related to the invention rather than the number, shape, and size of components in an actual implementation; in practice the form, quantity, and proportion of each component may vary freely, and the component layout may be more complex.
As shown in Fig. 1, a text retrieval method for industry-specific structured business data comprises the following steps:
Step S11: extract data from a business database into a temporary table;
Step S12: convert the data extracted from the business database;
Step S13: build an inverted index on the converted data;
Step S14: perform full-text search on the input text.
In one embodiment, if the extracted data is OLAP data, the data is extracted from the OLAP database incrementally; if the extracted data is OLTP data, the data is extracted from the OLTP database in full.
In one embodiment, if the extracted data is OLAP data, the text retrieval method further comprises a deduplication step, specifically: deleting the duplicate data in the temporary table.
In one embodiment, the text retrieval method further comprises:
Synchronizing the stored data into the OLAP database.
In one embodiment, building the inverted index on the converted data specifically comprises:
S131: concatenate the fields to be searched into one large search field. This step reduces the complexity of the index.
S132: split the search field into words using both a word segmenter and a designated separator. This step solves the problem of irrelevant fields interfering with search results.
For example, suppose a user wants to retrieve data containing the field value "in approval". Results in which a field is exactly "in approval" should rank first. However, if other, unrelated fields contain words such as "approval" or "in", the word segmenter also indexes that data under those words, and such unrelated data often ranks first because it matches more times. Splitting the search field with both a word segmenter and a designated separator therefore removes the interference of irrelevant fields with search results.
Note that the designated separator may be ",", "|", or a space; this embodiment places no restriction on it, and any symbol may be used.
S133: build the corresponding indexes.
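Steps S131 to S133 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the row layout, the `|` separator, and the helper names (`concat_fields`, `separator_tokens`, `toy_segmenter`, `build_indexes`) are hypothetical, and `toy_segmenter` is a whitespace splitter standing in for a real Chinese word segmenter.

```python
from collections import defaultdict


def concat_fields(row, fields, sep="|"):
    """S131: join the searchable fields of one row into a single search field."""
    return sep.join(str(row[name]) for name in fields)


def separator_tokens(search_field, sep="|"):
    """S132 (separator pass): split on the separator so each field value stays whole."""
    return [part.strip() for part in search_field.split(sep) if part.strip()]


def toy_segmenter(search_field, sep="|"):
    """S132 (segmenter pass): hypothetical stand-in that splits on whitespace."""
    return search_field.replace(sep, " ").split()


def build_indexes(rows, fields, sep="|"):
    """S133: build one inverted index per tokenization (token -> set of row ids)."""
    field_index = defaultdict(set)  # whole field values
    word_index = defaultdict(set)   # segmenter words
    for row_id, row in rows.items():
        search_field = concat_fields(row, fields, sep)
        for token in separator_tokens(search_field, sep):
            field_index[token].add(row_id)
        for token in toy_segmenter(search_field, sep):
            word_index[token].add(row_id)
    return field_index, word_index


rows = {
    1: {"plate": "A12345", "status": "in approval"},
    2: {"plate": "B67890", "status": "approval rejected in review"},
}
field_index, word_index = build_indexes(rows, ["plate", "status"])
```

In this toy data both rows contain the words "in" and "approval", so the word index alone cannot tell them apart, while the field index maps the whole value "in approval" to row 1 only. That is the interference the separator pass is meant to remove.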
The text retrieval method of the invention is further explained below using the greenplum database and GPtext as an example.
GPtext is a full-text search plug-in that runs on a greenplum cluster and is developed on the full-text search engine Solr; operations such as creating, committing, and querying an index are executed in SQL statements through functions that GPtext provides.
Step S110: in general, the data storage service and the full-text search service are separate, so before full-text search can be set up for stored data, that data must first be synchronized into the full-text search engine.
Step S111: extract data from the business database into a greenplum temporary table. Depending on the category of business data there are two system architectures, shown in Fig. 2 and Fig. 3. In Fig. 2 the index is built on OLTP data in an OLTP database, that is, data requiring real-time processing such as staff lists and transaction states. Because this data changes constantly, it must be extracted in full into the temporary table. In Fig. 3 the index is built on OLAP data in an OLAP database, that is, data that no longer changes once written, such as personnel entry and exit history and driving records. Such data is synchronized incrementally to the temporary table, using an auto-incrementing primary key as the reference.
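The two extraction modes can be illustrated with a small, self-contained sketch. It uses Python's built-in `sqlite3` in place of greenplum purely so the example runs anywhere; the table names, the `id` key column, and the `overlap` parameter (re-reading a few earlier keys to catch late rows) are assumptions made for the illustration.

```python
import sqlite3


def extract_full(conn, source, temp):
    """OLTP-style data keeps changing, so reload the whole table.

    Table names are interpolated directly; they are trusted constants here.
    """
    conn.execute(f"DELETE FROM {temp}")
    conn.execute(f"INSERT INTO {temp} SELECT * FROM {source}")


def extract_incremental(conn, source, temp, last_id, overlap=0):
    """OLAP-style data is append-only, so pull only rows past the last recorded key.

    `overlap` re-reads that many earlier keys to pick up late-arriving rows.
    """
    conn.execute(
        f"INSERT INTO {temp} SELECT * FROM {source} WHERE id > ?",
        (last_id - overlap,),
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE temp_history (id INTEGER, val TEXT);
    CREATE TABLE staff (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE temp_staff (id INTEGER, val TEXT);
    INSERT INTO history VALUES (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e');
    INSERT INTO staff VALUES (1,'x'),(2,'y');
    INSERT INTO temp_staff VALUES (9,'stale');
""")

extract_incremental(conn, "history", "temp_history", last_id=3, overlap=1)
extract_full(conn, "staff", "temp_staff")

incremental_ids = [r[0] for r in conn.execute("SELECT id FROM temp_history ORDER BY id")]
full_ids = [r[0] for r in conn.execute("SELECT id FROM temp_staff ORDER BY id")]
```

With `last_id=3` and `overlap=1`, the incremental pass re-reads key 3 along with the new keys 4 and 5, which is why a later deduplication step is needed.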
For incrementally extracted data, some OLAP data arrives late, so each extraction must re-extract part of the data preceding the previously recorded point. The data loaded into the temporary table is therefore redundant and must be deduplicated. As shown in Fig. 4, the deduplication procedure is:
Read the log table's record of the most recent load into the storage table;
Using the log record, read the primary keys of the most recently loaded data from the storage table;
Read the temporary table data;
Delete the rows in the temporary table whose primary keys duplicate primary keys in the storage table.
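The deduplication procedure above can be sketched as follows. Again `sqlite3` stands in for greenplum, and the schema (`storage_table`, `temp_table`, and a `load_log` table holding the primary-key range of each load) is a hypothetical layout chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE storage_table (id INTEGER PRIMARY KEY, val TEXT);
    CREATE TABLE temp_table (id INTEGER, val TEXT);
    CREATE TABLE load_log (min_id INTEGER, max_id INTEGER, indexed INTEGER);
    INSERT INTO storage_table VALUES (1,'a'),(2,'b'),(3,'c');
    INSERT INTO load_log VALUES (1, 3, 1);
    INSERT INTO temp_table VALUES (2,'b'),(3,'c'),(4,'d'),(5,'e');
""")


def deduplicate(conn):
    """Delete temp rows whose primary key was already loaded into storage."""
    # Read the log record of the most recent load into the storage table.
    last_load = conn.execute(
        "SELECT min_id, max_id FROM load_log ORDER BY rowid DESC LIMIT 1"
    ).fetchone()
    if last_load is None:
        return 0  # nothing has been loaded yet, so nothing can be duplicated
    min_id, max_id = last_load
    # Look up the keys of that load in storage and delete matching temp rows.
    cursor = conn.execute(
        "DELETE FROM temp_table WHERE id IN "
        "(SELECT id FROM storage_table WHERE id BETWEEN ? AND ?)",
        (min_id, max_id),
    )
    return cursor.rowcount


removed = deduplicate(conn)
remaining = [r[0] for r in conn.execute("SELECT id FROM temp_table ORDER BY id")]
```

Restricting the lookup to the key range of the last load keeps the comparison small, since only the re-extracted overlap can contain duplicates.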
Step S112: as shown in Fig. 5, read the temporary table in batches and process the rows read line by line. The data is converted and split according to the configured strategy, and the processed data is written to the storage table. Then determine whether the end of the temporary table has been reached. If so, record in the log table the maximum and minimum primary-key values of the data processed in this run, that is, the primary-key range of the storage-table data, and set the flag indicating whether the index has been built to no. If the end of the temporary table has not been reached, continue reading data from the temporary table in batches.
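A minimal in-memory sketch of this batch conversion loop is given below. The `transform` rule (trimming and upper-casing a plate number), the batch size, and the list-based tables are all hypothetical; the patent leaves the conversion strategy to configuration.

```python
def transform(row):
    """Hypothetical conversion strategy: trim and upper-case the plate number."""
    return {"id": row["id"], "plate": row["plate"].strip().upper()}


def convert_in_batches(temp_rows, storage, log, batch_size=2):
    """Convert temp rows batch by batch, then log the primary-key range handled."""
    keys = []
    for start in range(0, len(temp_rows), batch_size):
        batch = temp_rows[start:start + batch_size]   # read one batch
        for row in batch:                             # process it row by row
            converted = transform(row)
            storage.append(converted)                 # write to the storage table
            keys.append(converted["id"])
    if keys:
        # End of the temp table reached: record the key range, not yet indexed.
        log.append({"min_id": min(keys), "max_id": max(keys), "indexed": False})
    return log


temp_rows = [
    {"id": 1, "plate": " a12345 "},
    {"id": 2, "plate": "b67890"},
    {"id": 3, "plate": "c24680 "},
]
storage, log = [], []
convert_in_batches(temp_rows, storage, log)
```

The `indexed` flag in the log entry is what a later indexing step would look for to find data that still needs to be indexed.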
Step S113: as shown in Fig. 6, the log table is read to find data that has not yet been indexed, and it is determined whether an index has been created. If no index has been created, one must be created manually. Fig. 8 is the overall schematic of the flow from data to index.
The specific steps of index building are as follows:
Determine whether the index mapping has been configured. If it has, read the primary-key range in the log table that has not yet been indexed; if it has not, configure the index mapping and then read the primary-key range in the log table that has not yet been indexed;
Read the storage table data according to the primary-key range recorded in the log table and populate the index;
Commit the index;
Update the index status of the corresponding record in the log table to committed.
The flow from data to index is:
First, merge the fields to be searched into one larger search field;
Second, split the larger search field in two ways, by separator and by word segmenter;
Then build the corresponding field index and word-segmentation index.
Wherein, system symbol partitive indicates one kind of separator.
GPtext manages index mappings through gptext_config. Two kinds of index (a field index and a word-segmentation index) must be built for the search field, so the search field has to be copied; the index mapping file is configured as shown in Figure 8. After the index mapping is established, the gptext.index() function is called in an SQL statement to populate the index with the data not yet indexed, and the gptext.commit_index() function commits the index. After these steps the index has been built, and the flag in the log table recording whether the index has been built is updated to yes.
Step S114: as shown in Fig. 7, searching industry-specific data requires not only word-segmented text search but also some exact lookups on fields. Two indexes are therefore built as described in the previous step, and consequently two searches must be executed.
Specifically, executing the two searches comprises:
Entering the search content;
Passing the search content to the index of type text_sm and searching;
Passing the search content to the index of type text_intl and searching;
Combining the scores of the two searches and sorting in descending order;
Obtaining the fields stored in the index as the returned results.
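The two-pass search and score merge can be sketched as below. This is an illustration only and does not use GPtext: the two dictionaries stand in for the text_sm and text_intl indexes, the comma separator is one of the options the description allows, and the weights (2.0 for a whole-field match, 1.0 for a word match) are arbitrary assumptions, since the source does not say how the two scores are combined.

```python
from collections import defaultdict


def search_twice(query, field_index, word_index, sep=","):
    """Run the query against both indexes and rank rows by combined score."""
    scores = defaultdict(float)
    # Pass 1: separator-split terms against the whole-field index.
    for term in (part.strip() for part in query.split(sep)):
        for row_id in field_index.get(term, ()):
            scores[row_id] += 2.0  # assumed weight for an exact field match
    # Pass 2: segmented words against the word index (whitespace split here).
    for term in query.replace(sep, " ").split():
        for row_id in word_index.get(term, ()):
            scores[row_id] += 1.0  # assumed weight for a word match
    # Highest combined score first; ties broken by row id for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical indexes: token -> set of row ids.
field_index = {"A12345": {1}, "Central Park checkpoint": {1}, "B67890": {2}}
word_index = {
    "A12345": {1}, "Central": {1, 2}, "Park": {1, 2},
    "checkpoint": {1}, "B67890": {2}, "gate": {2},
}

exact_hit = search_twice("Central Park checkpoint", field_index, word_index)
mixed_hit = search_twice("A12345,Park", field_index, word_index)
word_only = search_twice("Park", field_index, word_index)
```

Row 1 outranks row 2 for the first two queries because it matches in both passes, whereas a query that matches only segmented words ("Park") leaves the rows tied.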
In GPtext, the two searches are performed as shown in Figure 10, which also shows the search results finally obtained.
It should be noted that the text retrieval method of the invention is not limited to the greenplum+GPtext architecture described in the example. Any other combination of database and full-text search engine (for example, mysql+elasticsearch) can achieve one-click search over structured business data by following similar methods and steps, and all such combinations fall within the scope of protection defined by the claims.
As shown in Fig. 11, a text retrieval system for industry-specific structured business data comprises:
A data extraction module 11, for extracting data from a database into a temporary table;
A data conversion module 12, for converting the data extracted from the database;
An index building module 13, for building an inverted index on the converted data;
A retrieval module 14, for performing full-text search on the input text.
Full-text search on the input text is carried out in a defined way; specifically, the search may be performed using the designated separator. For example, with a space as the separator the user enters "great favor temple white"; with a comma as the separator the user enters "great favor temple, white".
In one embodiment, if the data extracted by the data extraction module is OLAP data, the data is extracted from the OLAP database incrementally; if the data extracted by the data extraction module is OLTP data, the data is extracted from the OLTP database in full.
In one embodiment, because some OLAP data arrives late, each extraction must re-extract part of the data preceding the previously recorded point. The data loaded into the temporary table is therefore redundant and must be deduplicated. The text retrieval system accordingly further comprises a deduplication module, for deleting the duplicate data in the temporary table.
In general, the data storage service and the full-text search service are separate, so before full-text search can be set up for stored data, that data must first be synchronized into the full-text search engine. In one embodiment, the text retrieval system further comprises a synchronization module 15, for synchronizing the stored data into the OLAP database.
In one embodiment, the index building module 13 comprises:
A concatenation submodule 131, for concatenating the fields to be searched into one large search field;
A word-splitting submodule 132, for splitting the search field into words using a word segmenter and a designated separator;
An index building submodule 133, for building the corresponding indexes.
Note that since the system embodiments correspond to the method embodiments, the details of the system embodiments can be found in the description of the method embodiments and are not repeated here.
The invention addresses the inaccurate word segmentation that full-text search suffers on fields such as special numbers and specialized vocabulary in business databases, so that full-text search can also return good results on structured business data.
For example, a Chinese word segmenter can only segment ordinary words: "in review" may be segmented into "review" and "in". License plates and technical terms are not segmented correctly by such segmenters, with the consequence that they cannot be found by full-text search. For data such as "Chongqing A12345 Central Park checkpoint", in addition to segmentation by the word segmenter, the text is split once more on spaces, so that "Chongqing A12345" and "Central Park checkpoint" are each indexed as well.
Those skilled in the art will clearly understand that, for convenience and brevity of description, only the above division into functional units and modules is given as an example. In practice the functions may be assigned to different functional units and modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit; the integrated unit may be implemented in hardware or as a software functional unit. In addition, the specific names of the functional units and modules serve only to distinguish them from one another and do not limit the scope of protection of this application. For the specific working process of the units and modules in the above system, refer to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not detailed or recorded in one embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the invention.
In the embodiments provided by the invention, it should be understood that the disclosed device/terminal device and method may be implemented in other ways. For example, the device/terminal device embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or of other forms.
The above embodiments merely illustrate the principles and effects of the invention and are not intended to limit it. Anyone familiar with this technology may modify or change the above embodiments without departing from the spirit and scope of the invention. Therefore, all equivalent modifications or changes completed by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed by the invention shall still be covered by the claims of the invention.

Claims (10)

1. A text retrieval method for industry-specific structured business data, characterized in that the text retrieval method comprises:
Extracting data from a database into a temporary table;
Converting the data extracted from the database;
Building an inverted index on the converted data;
Performing full-text search on the input text.
2. The text retrieval method for industry-specific structured business data according to claim 1, characterized in that, if the extracted data is OLAP data, the data is extracted from the OLAP database incrementally; if the extracted data is OLTP data, the data is extracted from the OLTP database in full.
3. The text retrieval method for industry-specific structured business data according to claim 2, characterized in that, if the extracted data is OLAP data, the text retrieval method further comprises a deduplication step, specifically: deleting the duplicate data in the temporary table.
4. The text retrieval method for industry-specific structured business data according to claim 1, characterized in that the text retrieval method further comprises:
Synchronizing the stored data into the business database.
5. The text retrieval method for industry-specific structured business data according to claim 1, characterized in that building the inverted index on the converted data specifically comprises:
Concatenating the fields to be searched into one large search field;
Splitting the search field into words using a word segmenter and a designated separator;
Building the corresponding indexes.
6. A text retrieval system for industry-specific structured business data, characterized in that the text retrieval system comprises:
A data extraction module, for extracting data from a database into a temporary table;
A data conversion module, for converting the data extracted from the database;
An index building module, for building an inverted index on the converted data;
A retrieval module, for performing full-text search on the input text.
7. The text retrieval system for industry-specific structured business data according to claim 1, characterized in that, if the extracted data is OLAP data, the data is extracted from the OLAP database incrementally; if the extracted data is OLTP data, the data is extracted from the OLTP database in full.
8. The text retrieval system for industry-specific structured business data according to claim 2, characterized in that, if the data extracted by the data extraction module is OLAP data, the text retrieval system further comprises a deduplication module, for deleting the duplicate data in the temporary table.
9. The text retrieval system for industry-specific structured business data according to claim 1, characterized in that the text retrieval system further comprises a synchronization module, for synchronizing the stored data into the business database.
10. The text retrieval system for industry-specific structured business data according to claim 1, characterized in that the index building module comprises:
A concatenation submodule, for concatenating the fields to be searched into one large search field;
A word-splitting submodule, for splitting the search field into words using a word segmenter and a designated separator;
An index building submodule, for building the corresponding indexes.
CN201910558557.XA 2019-06-26 2019-06-26 Text retrieval method and system for industry-specific structured business data Pending CN110297829A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910558557.XA CN110297829A (en) 2019-06-26 2019-06-26 Text retrieval method and system for industry-specific structured business data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910558557.XA CN110297829A (en) 2019-06-26 2019-06-26 Text retrieval method and system for industry-specific structured business data

Publications (1)

Publication Number Publication Date
CN110297829A (en) 2019-10-01

Family

ID=68028827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910558557.XA Pending CN110297829A (en) 2019-06-26 2019-06-26 Text retrieval method and system for industry-specific structured business data

Country Status (1)

Country Link
CN (1) CN110297829A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559550A (en) * 2020-10-30 2021-03-26 北京智源人工智能研究院 Multi-data-source NL2SQL system based on semantic rules and multi-dimensional model
CN116401259A (en) * 2023-06-08 2023-07-07 北京江融信科技有限公司 Automatic pre-creation index method and system for elastic search database

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100161623A1 (en) * 2008-12-22 2010-06-24 Microsoft Corporation Inverted Index for Contextual Search
CN101853288A (en) * 2010-05-19 2010-10-06 马晓普 Configurable full-text retrieval service system based on document real-time monitoring
CN103823799A (en) * 2012-11-16 2014-05-28 镇江诺尼基智能技术有限公司 New-generation industry knowledge full-text search method
CN105335479A (en) * 2015-10-12 2016-02-17 国家计算机网络与信息安全管理中心 Text data statistics realization method based on SQL
CN108446323A (en) * 2018-02-11 2018-08-24 山东省农业信息中心 A kind of data retrieval method and device based on full-text search engine

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
佳佳呀: "Introduction to Big Data (4): OLTP and OLAP, Databases and Data Warehouses", 《HTTPS://WWW.CNBLOGS.COM/NOVEMBERRAIN/P/9896064.HTML》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112559550A (en) * 2020-10-30 2021-03-26 北京智源人工智能研究院 Multi-data-source NL2SQL system based on semantic rules and multi-dimensional model
CN116401259A (en) * 2023-06-08 2023-07-07 北京江融信科技有限公司 Automatic pre-creation index method and system for elastic search database
CN116401259B (en) * 2023-06-08 2023-08-22 北京江融信科技有限公司 Automatic pre-creation index method and system for elastic search database

Similar Documents

Publication Publication Date Title
CN111753099B (en) Method and system for enhancing relevance of archive entity based on knowledge graph
CN107491561B (en) Ontology-based urban traffic heterogeneous data integration system and method
CN104794247B (en) A kind of more structural databases integrate querying method
CN111597308A (en) Knowledge graph-based voice question-answering system and application method thereof
CN103514201B (en) Method and device for querying data in non-relational database
CN108446368A (en) A kind of construction method and equipment of Packaging Industry big data knowledge mapping
CN109992645A (en) A kind of data supervision system and method based on text data
CN102982076A (en) Multi-dimensionality content labeling method based on semanteme label database
CN105608232A (en) Bug knowledge modeling method based on graphic database
CN113190687B (en) Knowledge graph determining method and device, computer equipment and storage medium
CN106599052A (en) Data query system based on ApacheKylin, and method thereof
CN113157860B (en) Electric power equipment maintenance knowledge graph construction method based on small-scale data
CN108304382A (en) Mass analysis method based on manufacturing process text data digging and system
CN110297829A (en) A kind of text searching method and system towards specific industry structuring business datum
CN114218218A (en) Data processing method, device and equipment based on data warehouse and storage medium
Yang et al. Database Semantic Interoperability based on Information Flow Theory and Formal Concept Analysis
CN116842142B (en) Intelligent retrieval system for medical instrument
CN109446277A (en) Relational data intelligent search method and system based on Chinese natural language
CN105824956A (en) Inverted index model based on link list structure and construction method of inverted index model
CN106611016A (en) Image retrieval method based on decomposable word pack model
CN106649599A (en) Knowledge service oriented scientific research data processing and predictive analysis platform
CN102043849B (en) Realization method for electronic dictionary system with ideographic components as elements
CN105512270A (en) Method and device for determining related objects
Chang et al. Mining semantics for large scale integration on the web: evidences, insights, and challenges
CN114691878A (en) Construction method of automobile standard knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191001