CN107169118A - A kind of quick urban information searching system - Google Patents
- Publication number
- CN107169118A
- Authority
- CN
- China
- Prior art keywords
- word
- mrow
- information
- field concept
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9537—Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Machine Translation (AREA)
Abstract
The invention provides a fast urban information retrieval system comprising a mobile terminal and an urban data center. The mobile terminal sends a domain information request to the urban data center, and the urban data center performs the corresponding domain information retrieval according to the request and returns the results to the mobile terminal. Beneficial effect of the invention: fast retrieval of urban information.
Description
Technical field
The present invention relates to the technical field of information retrieval, and in particular to a fast urban information retrieval system.
Background
When people travel to a city, they want a clear picture of local information of all kinds. How to quickly retrieve the information they want from a massive volume of information has become a pressing problem.
Summary of the invention
In view of the above problem, the present invention aims to provide a fast urban information retrieval system.
The purpose of the invention is achieved by the following technical solution:
A fast urban information retrieval system is provided, comprising a mobile terminal and an urban data center. The mobile terminal sends a domain information request to the urban data center, and the urban data center performs the corresponding domain information retrieval according to the request and returns the results to the mobile terminal.
Beneficial effect of the invention: fast retrieval of urban information.
Brief description of the drawings
The invention is further described with reference to the accompanying drawing. The embodiment shown in the drawing does not limit the invention in any way, and a person of ordinary skill in the art can derive other drawings from the following drawing without creative effort.
Fig. 1 is a structural schematic diagram of the invention;
Reference numerals:
Mobile terminal 1, Urban Data center 2.
Embodiment
The invention is further described with reference to the following embodiments.
Referring to Fig. 1, the fast urban information retrieval system of this embodiment comprises a mobile terminal 1 and an urban data center 2. The mobile terminal 1 sends a domain information request to the urban data center 2, and the urban data center 2 performs the corresponding domain information retrieval according to the request and returns the results to the mobile terminal 1.
This embodiment achieves fast retrieval of urban information.
Preferably, the mobile terminal 1 includes a mobile phone and a tablet computer.
This preferred embodiment provides a variety of information retrieval terminals.
Preferably, the mobile terminal 1 includes a GPS positioning module.
This preferred embodiment provides the user with a positioning function.
Preferably, the urban data center 2 includes an information input subsystem, a field concept acquisition subsystem and an information retrieval subsystem. The information input subsystem receives the domain information request sent by the user, the field concept acquisition subsystem obtains the corresponding field concepts from a corpus, and the information retrieval subsystem performs the corresponding information retrieval according to the field concepts.
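The flow through the three subsystems can be sketched as a simple pipeline. This is only an illustrative sketch: the class name `UrbanDataCenter`, its three callables and the toy `INDEX` are hypothetical and do not come from the patent, which does not specify any interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Toy index standing in for the city's domain information (made-up data).
INDEX: Dict[str, List[str]] = {
    "museum": ["City Museum, open 9-17"],
    "hotel": ["Riverside Hotel"],
}


@dataclass
class UrbanDataCenter:
    """Hypothetical stand-in for the urban data center 2."""
    recognize: Callable[[str], str]              # information input subsystem
    get_concepts: Callable[[str], List[str]]     # field concept acquisition subsystem
    retrieve: Callable[[List[str]], List[str]]   # information retrieval subsystem

    def handle(self, request: str) -> List[str]:
        """Process one request from the mobile terminal and return the results."""
        text = self.recognize(request)
        concepts = self.get_concepts(text)
        return self.retrieve(concepts)


center = UrbanDataCenter(
    recognize=lambda raw: raw.strip().lower(),
    get_concepts=lambda text: [w for w in text.split() if w in INDEX],
    retrieve=lambda concepts: [hit for c in concepts for hit in INDEX[c]],
)

results = center.handle("  Museum near hotel ")
print(results)
```

Keeping each subsystem behind its own callable mirrors the separation described above, so any one stage could be swapped out without touching the others.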
The information input subsystem includes a voice input module and a text input module. The voice input module recognizes input voice information, and the text input module recognizes input text information.
The voice input module includes a voice information acquisition unit, a voice information storage unit, a voice information transmission unit, an audio-to-text conversion unit and a text recognition unit. The voice information acquisition unit collects voice information; the voice information storage unit stores the collected voice information; the voice information transmission unit transmits the stored voice information to the audio-to-text conversion unit; the audio-to-text conversion unit converts the audio information into text information; and the text recognition unit recognizes the text information.
The text input module includes a text information input unit, a text information storage unit, a text information reading unit, a communication unit and a text information recognition unit. The text information input unit is used to enter text information manually; the text information storage unit stores the entered text information; the text information reading unit reads the stored text information; the communication unit transmits the text information read out to the text information recognition unit; and the text information recognition unit recognizes the received text information.
This preferred embodiment supports both voice input and handwritten input of information.
Preferably, the field concept acquisition subsystem includes a first set generation module and a second concept acquisition module. The first set generation module generates a word set from the corpus, and the second concept acquisition module obtains field concepts from the word set.
The word set is generated by the following steps:
Step 1: segment the corpus into words sentence by sentence, first removing stop words, to generate the word set W; then extract multi-word phrases from W to obtain the candidate word set WL.
Step 2: if WL is not empty, take a character string w ∈ WL; if w satisfies the compound word decision condition, add w as a compound word to the compound word set, CW = CW ∪ {w}.
Step 3: output the word set W = W ∪ CW.
The compound word decision condition is determined by the following steps:
Step 1: let the character string be $w = s_1 s_2 \ldots s_n$, where $s_1, s_2, \ldots, s_n$ are the words obtained by segmenting w, and let $YW(s_1, s_2, \ldots, s_n)$ denote their mutual information index (the defining formula is missing from the source text). In that formula, $P(s_i)$ is the probability that word $s_i$ occurs and $P(s_1, s_2, \ldots, s_n)$ is the probability that $s_1, s_2, \ldots, s_n$ occur together in the corpus, where
$$P(s_i) = \frac{F(s_i)}{F}, \qquad P(s_1, s_2, \ldots, s_n) = \frac{F(s_1, s_2, \ldots, s_n)}{F}.$$
Here $F(s_i)$ is the number of sentences containing word $s_i$, $F(s_1, s_2, \ldots, s_n)$ is the number of sentences containing all of $s_1, s_2, \ldots, s_n$, and $F$ is the total number of sentences.
Step 2: if $YW(s_1, s_2, \ldots, s_n) > YW_1$, where $YW_1$ is a given threshold, w is determined to be a compound word.
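The word set generation and the compound word test can be sketched as follows. Because the patent's own mutual information formula does not survive in this text, the sketch substitutes the standard pointwise mutual information, log2 of the joint probability over the product of the individual probabilities; that choice is an assumption, not the patent's definition. The function names, the restriction to adjacent two-word candidates and the toy sentences are likewise hypothetical.

```python
import math
from typing import List, Sequence, Set, Tuple

SENTENCES: List[List[str]] = [          # toy corpus, already segmented
    ["the", "city", "museum", "opens"],
    ["city", "museum", "tickets"],
    ["the", "river", "park"],
    ["park", "tickets"],
]


def sentence_count(sentences: Sequence[Sequence[str]], words: Tuple[str, ...]) -> int:
    """F(...): number of sentences that contain every word in `words`."""
    return sum(1 for s in sentences if all(w in s for w in words))


def mutual_information(sentences: Sequence[Sequence[str]], words: Tuple[str, ...]) -> float:
    """Assumed index: log2(P(s1..sn) / (P(s1) * ... * P(sn))), with P = F / total."""
    total = len(sentences)
    joint = sentence_count(sentences, words) / total
    if joint == 0:
        return float("-inf")
    product = 1.0
    for w in words:
        product *= sentence_count(sentences, (w,)) / total
    return math.log2(joint / product)


def build_word_set(sentences: Sequence[Sequence[str]], stop_words: Set[str],
                   yw1: float, n: int = 2) -> Set[str]:
    """Steps 1-3: remove stop words, collect n-word candidates, keep compounds."""
    cleaned = [[w for w in s if w not in stop_words] for s in sentences]
    word_set = {w for s in cleaned for w in s}                       # W
    candidates = {tuple(s[i:i + n]) for s in cleaned                 # WL
                  for i in range(len(s) - n + 1)}
    compounds = {" ".join(c) for c in candidates                     # CW
                 if mutual_information(cleaned, c) > yw1}
    return word_set | compounds                                      # W ∪ CW


word_set = build_word_set(SENTENCES, stop_words={"the"}, yw1=0.5)
print(sorted(word_set))
```

With the threshold at 0.5, "city museum" is kept as a compound because the two words co-occur more often than independence would predict, while "park tickets" is not.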
Before obtaining field concepts, the urban information retrieval system of this preferred embodiment extracts the compound words in the corpus. This overcomes the failure of conventional field concept acquisition to consider compound words and prevents field concepts that are compound words from being filtered out during the later selection of candidate concepts. In operation, a new compound word decision condition is proposed that yields an accurate compound word set, so that the urban information retrieval system can retrieve more comprehensive domain information.
Preferably, the field concepts are obtained by the following steps:
Step 1: select field concepts from a domain knowledge base in advance as the initial field concept set DC.
Step 2: for each word s in the word set W, calculate its semantic similarity CS(s, DC) using cosine similarity. If $CS(s, DC) > CS_1$, where $CS_1$ is a given threshold, add s to the field concept set to obtain the once-updated field concept set, and remove s from W to obtain the updated word set.
Step 3: take the words s one by one from the updated word set; if s satisfies the candidate concept decision condition, add s to the candidate concept set CC.
Step 4: evaluate each candidate concept s in the candidate concept set CC to obtain its evaluation value ZC, and add the EM words with the largest evaluation values to the once-updated field concept set to obtain the final field concept set, where EM ∈ [5, 9].
The candidate concept decision condition is determined by the following steps:
Step 1: calculate the number of sentences in the corpus that contain word s. This number equals the sum of the sentence counts of the words that make up s:
$$F(s) = \sum_{i=1}^{n} F(s_i)$$
where n is the number of words that s contains, $s_i$ is the i-th word contained in s, and F(s) is the number of sentences in the corpus that contain word s.
Step 2: calculate F(s, DC), the number of sentences in the corpus in which word s occurs together with any field concept in the initial field concept set (the formula is missing from the source text; in it, dc denotes any field concept in the initial field concept set).
Step 3: calculate the candidate value FS of word s (the formula is missing from the source text; in it, $F_{max}(s, DC)$ denotes the maximum number of sentences in which word s occurs together with a single field concept in the initial field concept set). If $FS > FS_1$, where $FS_1$ is a set threshold, word s is a candidate concept.
The evaluation value ZC is determined by the following formula:
$$ZC = 1 + \log_2 F(s, DC) \times \frac{F(s, DC)}{F(s)} + \frac{F(s, DC)}{F(s)}$$
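Step 4 can be illustrated with a short sketch that scores candidates with the evaluation value ZC = 1 + log2 F(s, DC) × F(s, DC)/F(s) + F(s, DC)/F(s), as given in claim 8, and keeps the EM highest-scoring words. The function names and the sentence counts in `COUNTS` are made up for illustration; how F(s, DC) is counted is taken as an input here rather than implemented, since that formula is not available in this text.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical counts per candidate: (F(s, DC), F(s)).
COUNTS: Dict[str, Tuple[int, int]] = {
    "metro": (8, 10),
    "museum": (4, 8),
    "hotel": (2, 8),
    "taxi": (1, 4),
    "park": (4, 4),
    "bus": (1, 10),
}


def evaluation_value(f_s_dc: int, f_s: int) -> float:
    """ZC = 1 + log2 F(s,DC) * F(s,DC)/F(s) + F(s,DC)/F(s)."""
    if f_s_dc <= 0 or f_s <= 0:
        raise ValueError("sentence counts must be positive")
    ratio = f_s_dc / f_s
    return 1 + math.log2(f_s_dc) * ratio + ratio


def top_candidates(counts: Dict[str, Tuple[int, int]], em: int = 5) -> List[str]:
    """Return the `em` candidates with the largest ZC, where 5 <= em <= 9."""
    if not 5 <= em <= 9:
        raise ValueError("EM must lie in [5, 9]")
    scored = {s: evaluation_value(*c) for s, c in counts.items()}
    return sorted(scored, key=scored.get, reverse=True)[:em]


selected = top_candidates(COUNTS, em=5)
print(selected)
```

A candidate scores highly when it co-occurs with the initial field concepts both often in absolute terms (the log term) and in a large share of the sentences it appears in (the ratio terms).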
When obtaining field concepts, the urban information retrieval system of this preferred embodiment takes the semantic similarity of field concepts into account, which avoids missing semantically similar field concepts as a purely statistical method would. When determining the field concepts, candidate concepts are determined first and the field concepts afterwards, so the field concepts obtained better match the characteristics of the domain and the urban information retrieval system can retrieve more accurate domain information.
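The semantic similarity filter of Step 2 of the field concept procedure can be sketched as below. The patent only states that cosine similarity is used; how words are turned into vectors and how the similarity to a set DC is aggregated are not specified, so this sketch assumes precomputed word vectors and takes the maximum similarity over the initial concepts. All names and the two-dimensional toy vectors are hypothetical.

```python
import math
from typing import Dict, Iterable, Sequence, Set, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_to_set(vec: Sequence[float],
                      concept_vecs: Iterable[Sequence[float]]) -> float:
    """Assumed CS(s, DC): the largest cosine similarity to any concept in DC."""
    return max((cosine(vec, c) for c in concept_vecs), default=0.0)


def expand_concepts(words: Dict[str, Sequence[float]],
                    dc: Dict[str, Sequence[float]],
                    cs1: float) -> Tuple[Set[str], Set[str]]:
    """Return (once-updated field concept set, updated word set)."""
    initial_vecs = list(dc.values())
    updated_concepts = set(dc)
    remaining_words: Set[str] = set()
    for word, vec in words.items():
        if similarity_to_set(vec, initial_vecs) > cs1:
            updated_concepts.add(word)      # s joins the field concept set
        else:
            remaining_words.add(word)       # s stays in W for Step 3
    return updated_concepts, remaining_words


DC = {"subway": [1.0, 0.0]}
WORDS = {"metro": [0.9, 0.1], "pizza": [0.0, 1.0]}
concepts, remaining = expand_concepts(WORDS, DC, cs1=0.8)
print(concepts, remaining)
```

Words that fail the similarity test are not discarded; they remain in the updated word set and are still eligible under the statistical candidate concept condition.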
Information retrieval was carried out using the fast urban information retrieval system of the invention. For different values of EM, information retrieval accuracy and information retrieval efficiency were measured and compared with retrieval performed without the invention. The resulting improvements are shown in the table:
EM | Improvement in information retrieval accuracy | Improvement in information retrieval efficiency |
---|---|---|
5 | 32% | 31% |
6 | 27% | 24% |
7 | 25% | 20% |
8 | 20% | 16% |
9 | 18% | 15% |
Finally, it should be noted that the above embodiments merely illustrate the technical solution of the invention and do not limit its scope of protection. Although the invention has been explained in detail with reference to preferred embodiments, a person of ordinary skill in the art should understand that the technical solution of the invention can be modified or replaced by equivalents without departing from its substance and scope.
Claims (8)
1. A fast urban information retrieval system, characterized in that it comprises a mobile terminal and an urban data center, the mobile terminal being used to send a domain information request to the urban data center, and the urban data center being used to perform the corresponding domain information retrieval according to the request and return the results to the mobile terminal.
2. The fast urban information retrieval system according to claim 1, characterized in that the mobile terminal includes a mobile phone and a tablet computer.
3. The fast urban information retrieval system according to claim 2, characterized in that the mobile terminal includes a GPS positioning module.
4. The fast urban information retrieval system according to claim 3, characterized in that the urban data center includes an information input subsystem, a field concept acquisition subsystem and an information retrieval subsystem, the information input subsystem being used to receive the domain information request sent by the user, the field concept acquisition subsystem being used to obtain the corresponding field concepts from a corpus, and the information retrieval subsystem being used to perform the corresponding information retrieval according to the field concepts.
5. The fast urban information retrieval system according to claim 4, characterized in that the information input subsystem includes a voice input module and a text input module, the voice input module being used to recognize input voice information and the text input module being used to recognize input text information; the voice input module includes a voice information acquisition unit, a voice information storage unit, a voice information transmission unit, an audio-to-text conversion unit and a text recognition unit, the voice information acquisition unit being used to collect voice information, the voice information storage unit being used to store the collected voice information, the voice information transmission unit being used to transmit the stored voice information to the audio-to-text conversion unit, the audio-to-text conversion unit being used to convert the audio information into text information, and the text recognition unit being used to recognize the text information; the text input module includes a text information input unit, a text information storage unit, a text information reading unit, a communication unit and a text information recognition unit, the text information input unit being used to enter text information manually, the text information storage unit being used to store the entered text information, the text information reading unit being used to read the stored text information, the communication unit being used to transmit the text information read out to the text information recognition unit, and the text information recognition unit being used to recognize the received text information.
6. The fast urban information retrieval system according to claim 5, characterized in that the field concept acquisition subsystem includes a first set generation module and a second concept acquisition module, the first set generation module being used to generate a word set from the corpus and the second concept acquisition module being used to obtain field concepts from the word set; the word set is generated by the following steps: Step 1, segment the corpus into words sentence by sentence, first removing stop words, to generate the word set W, then extract multi-word phrases from W to obtain the candidate word set WL; Step 2, if WL is not empty, take a character string w ∈ WL, and if w satisfies the compound word decision condition, add w as a compound word to the compound word set, CW = CW ∪ {w}; Step 3, output the word set W = W ∪ CW; the compound word decision condition is determined by the following steps: Step 1, let the character string be $w = s_1 s_2 \ldots s_n$, where $s_1, s_2, \ldots, s_n$ are the words obtained by segmenting w, and let $YW(s_1, s_2, \ldots, s_n)$ denote their mutual information index (the defining formula is missing from the source text), in which $P(s_i)$ is the probability that word $s_i$ occurs and $P(s_1, s_2, \ldots, s_n)$ is the probability that $s_1, s_2, \ldots, s_n$ occur together in the corpus, where $P(s_i) = F(s_i)/F$ and $P(s_1, s_2, \ldots, s_n) = F(s_1, s_2, \ldots, s_n)/F$, $F(s_i)$ being the number of sentences containing word $s_i$, $F(s_1, s_2, \ldots, s_n)$ being the number of sentences containing all of $s_1, s_2, \ldots, s_n$, and $F$ being the total number of sentences; Step 2, if $YW(s_1, s_2, \ldots, s_n) > YW_1$, where $YW_1$ is a given threshold, w is determined to be a compound word.
7. The fast urban information retrieval system according to claim 6, characterized in that the field concepts are obtained by the following steps: Step 1, select field concepts from a domain knowledge base in advance as the initial field concept set DC; Step 2, for each word s in the word set W, calculate its semantic similarity CS(s, DC) using cosine similarity, and if $CS(s, DC) > CS_1$, where $CS_1$ is a given threshold, add s to the field concept set to obtain the once-updated field concept set, and remove s from W to obtain the updated word set; Step 3, take the words s one by one from the updated word set, and if s satisfies the candidate concept decision condition, add s to the candidate concept set CC; Step 4, evaluate each candidate concept s in the candidate concept set CC to obtain its evaluation value ZC, and add the EM words with the largest evaluation values to the once-updated field concept set to obtain the final field concept set, where EM ∈ [5, 9].
8. The fast urban information retrieval system according to claim 7, characterized in that the candidate concept decision condition is determined by the following steps: Step 1, calculate the number of sentences in the corpus that contain word s, this number being equal to the sum of the sentence counts of the words that make up s: $F(s) = \sum_{i=1}^{n} F(s_i)$, where n is the number of words that s contains, $s_i$ is the i-th word contained in s, and F(s) is the number of sentences in the corpus that contain word s; Step 2, calculate F(s, DC), the number of sentences in the corpus in which word s occurs together with any field concept in the initial field concept set (the formula is missing from the source text; in it, dc denotes any field concept in the initial field concept set); Step 3, calculate the candidate value FS of word s (the formula is missing from the source text; in it, $F_{max}(s, DC)$ denotes the maximum number of sentences in which word s occurs together with a single field concept in the initial field concept set); if $FS > FS_1$, where $FS_1$ is a set threshold, word s is a candidate concept; the evaluation value ZC is determined by the following formula:
$$ZC = 1 + \log_2 F(s, DC) \times \frac{F(s, DC)}{F(s)} + \frac{F(s, DC)}{F(s)}.$$
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710380733.6A CN107169118A (en) | 2017-05-25 | 2017-05-25 | A kind of quick urban information searching system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107169118A true CN107169118A (en) | 2017-09-15 |
Family
ID=59820704
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710380733.6A Pending CN107169118A (en) | 2017-05-25 | 2017-05-25 | A kind of quick urban information searching system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107169118A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090210411A1 (en) * | 2008-02-15 | 2009-08-20 | Oki Electric Industry Co., Ltd. | Information Retrieving System |
CN106202514A (en) * | 2016-07-21 | 2016-12-07 | Beijing University of Posts and Telecommunications | Agent-based cross-media information retrieval method and system for emergencies |
Non-Patent Citations (1)
Title |
---|
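Yang Yuehua: "Research on an Intelligent Emergency Information Retrieval System Based on a Domain Knowledge Model", China Doctoral Dissertations Full-text Database (Electronic Journal) * |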
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102196400B1 (en) | Determining hotword suitability | |
CN104078044B (en) | The method and apparatus of mobile terminal and recording search thereof | |
CN110019732B (en) | Intelligent question answering method and related device | |
CN100545847C (en) | A kind of method and system that blog articles is sorted | |
CN109508458B (en) | Legal entity identification method and device | |
CN108536807B (en) | Information processing method and device | |
CN101566998A (en) | Chinese question-answering system based on neural network | |
CN103794211B (en) | A kind of audio recognition method and system | |
CN102246169A (en) | Assigning an indexing weight to a search term | |
CN112560450A (en) | Text error correction method and device | |
CN103871402A (en) | Language model training system, a voice identification system and corresponding method | |
CN105791446A (en) | Method, device and system for processing private lending | |
CN110705292A (en) | Entity name extraction method based on knowledge base and deep learning | |
CN116150651A (en) | AI-based depth synthesis detection method and system | |
US20230054726A1 (en) | Query-focused extractive text summarization of textual data | |
CN107169118A (en) | A kind of quick urban information searching system | |
TW201407390A (en) | Data clustering apparatus and method | |
CN103247316A (en) | Method and system for constructing index in voice frequency retrieval | |
CN115331675A (en) | Method and device for processing user voice | |
CN111177585A (en) | Map POI feedback method and device | |
CN112115237B (en) | Construction method and device of tobacco science and technology literature data recommendation model | |
CN111368028B (en) | Method and device for recommending question respondents | |
CN107193802A (en) | A kind of smart field concept auto acquisition system | |
CN108376365B (en) | Bank number determining method and device | |
CN107205029A (en) | A kind of efficient electronic burst event management system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20170915 |