CN107169118A - A kind of quick urban information searching system - Google Patents

Info

Publication number
CN107169118A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710380733.6A
Other languages
Chinese (zh)
Inventor
Not disclosed (inventor requested non-publication)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mdt Infotech Ltd Of Shanghai Zhe
Original Assignee
Mdt Infotech Ltd Of Shanghai Zhe
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mdt Infotech Ltd Of Shanghai Zhe filed Critical Mdt Infotech Ltd Of Shanghai Zhe
Priority to CN201710380733.6A priority Critical patent/CN107169118A/en
Publication of CN107169118A publication Critical patent/CN107169118A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a quick urban information searching system comprising a mobile terminal and an urban data center. The mobile terminal is used to send a domain information request to the urban data center, and the urban data center is used to retrieve the corresponding domain information according to the request and feed it back to the mobile terminal. The beneficial effect of the invention is that urban information can be retrieved quickly.

Description

A kind of quick urban information searching system
Technical field
The present invention relates to the technical field of information retrieval, and in particular to a quick urban information searching system.
Background art
When people travel to a city, they want to learn about all kinds of local information. How to quickly retrieve the information they want from a massive amount of information has become a problem they face.
Summary of the invention
In view of the above problem, the present invention aims to provide a quick urban information searching system.
The object of the present invention is achieved by the following technical solution:
A quick urban information searching system is provided, comprising a mobile terminal and an urban data center. The mobile terminal is used to send a domain information request to the urban data center, and the urban data center is used to retrieve the corresponding domain information according to the request and feed it back to the mobile terminal.
The beneficial effect of the present invention is that urban information can be retrieved quickly.
Brief description of the drawings
The invention is further described with reference to the accompanying drawing, but the embodiment in the drawing does not limit the invention in any way; a person of ordinary skill in the art can obtain other drawings from the following drawing without creative effort.
Fig. 1 is a structural schematic diagram of the present invention.
Reference numerals:
Mobile terminal 1, urban data center 2.
Detailed description of the embodiments
The invention is further described with reference to the following embodiment.
Referring to Fig. 1, the quick urban information searching system of this embodiment comprises a mobile terminal 1 and an urban data center 2. The mobile terminal 1 is used to send a domain information request to the urban data center 2, and the urban data center 2 is used to retrieve the corresponding domain information according to the request and feed it back to the mobile terminal 1.
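The request and feedback flow between mobile terminal 1 and urban data center 2 can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the invention: the class names, the request and response shapes, and the toy substring-matching backend are all hypothetical, since the text does not specify a message format or a retrieval backend.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DomainInfoRequest:
    """Domain information request sent by the mobile terminal (hypothetical shape)."""
    query: str


@dataclass
class DomainInfoResponse:
    """Retrieval results fed back to the mobile terminal (hypothetical shape)."""
    query: str
    results: List[str]


class UrbanDataCenter:
    """Stands in for urban data center 2: receives a request, retrieves, feeds back."""

    def __init__(self, retrieve: Callable[[str], List[str]]):
        self._retrieve = retrieve  # retrieval backend is injected (assumption)

    def handle(self, request: DomainInfoRequest) -> DomainInfoResponse:
        return DomainInfoResponse(request.query, self._retrieve(request.query))


class MobileTerminal:
    """Stands in for mobile terminal 1: sends requests and receives the feedback."""

    def __init__(self, data_center: UrbanDataCenter):
        self._data_center = data_center

    def search(self, query: str) -> List[str]:
        response = self._data_center.handle(DomainInfoRequest(query))
        return response.results


def toy_retrieve(query: str) -> List[str]:
    """Toy backend: substring match over a tiny in-memory list of documents."""
    documents = ["museum opening hours", "metro line map", "museum ticket prices"]
    return [doc for doc in documents if query in doc]
```

For example, `MobileTerminal(UrbanDataCenter(toy_retrieve)).search("museum")` returns the two museum entries from the toy list.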
This embodiment achieves quick retrieval of urban information.
Preferably, the mobile terminal 1 includes a mobile phone and a tablet computer.
This preferred embodiment provides a variety of terminals for information retrieval.
Preferably, the mobile terminal 1 includes a GPS positioning module.
This preferred embodiment provides the user with a positioning function.
Preferably, the urban data center 2 includes an information input subsystem, a field concept acquisition subsystem and an information retrieval subsystem. The information input subsystem is used to input the domain information request sent by the user, the field concept acquisition subsystem is used to obtain the corresponding field concepts from a corpus, and the information retrieval subsystem is used to carry out the corresponding information retrieval according to the field concepts.
The information input subsystem includes a voice input module and a text input module. The voice input module is used to recognize input voice information, and the text input module is used to recognize input text information.
The voice input module includes a voice information collection unit, a voice information storage unit, a voice information transmission unit, an audio-to-text conversion unit and a text recognition unit. The voice information collection unit is used to collect voice information; the voice information storage unit is used to store the collected voice information; the voice information transmission unit is used to transmit the stored voice information to the audio-to-text conversion unit; the audio-to-text conversion unit is used to convert the voice information into text information; and the text recognition unit is used to recognize the text information.
The text input module includes a text information input unit, a text information storage unit, a text information reading unit, a communication unit and a text information recognition unit. The text information input unit is used to enter text information manually; the text information storage unit is used to store the entered text information; the text information reading unit is used to read the stored text information; the communication unit is used to transmit the text information that has been read to the text information recognition unit; and the text information recognition unit is used to recognize the received text information.
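The two input pipelines described above can be sketched as follows. This is a minimal illustration under stated assumptions: the class and function names are hypothetical, the audio-to-text conversion is an injected callable because the text names no speech recognition engine, and "recognition" is reduced to whitespace and case normalization purely as a stand-in.

```python
from typing import Callable, List


def recognize_text(text: str) -> str:
    """Stand-in for a recognition unit: collapse whitespace and lowercase (assumption)."""
    return " ".join(text.split()).lower()


class VoiceInputModule:
    """Collect -> store -> transmit -> convert audio to text -> recognize."""

    def __init__(self, speech_to_text: Callable[[bytes], str]):
        self._speech_to_text = speech_to_text  # audio-to-text conversion unit (injected)
        self._storage: List[bytes] = []        # voice information storage unit

    def collect(self, audio: bytes) -> None:
        """Collection unit: keep the captured clip in storage."""
        self._storage.append(audio)

    def process(self) -> List[str]:
        """Hand each stored clip to conversion, then recognize the resulting text."""
        clips, self._storage = self._storage, []
        return [recognize_text(self._speech_to_text(clip)) for clip in clips]


class TextInputModule:
    """Input -> store -> read -> communicate -> recognize."""

    def __init__(self) -> None:
        self._storage: List[str] = []          # text information storage unit

    def write(self, text: str) -> None:
        """Input unit: store the manually entered text."""
        self._storage.append(text)

    def process(self) -> List[str]:
        """Read the stored text and pass it on to recognition."""
        texts, self._storage = self._storage, []
        return [recognize_text(text) for text in texts]
```

Injecting the conversion function keeps the sketch independent of any particular speech recognition library; a real system would supply its own engine at that point.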
This preferred embodiment supports both voice input and manual text input of information.
Preferably, the field concept acquisition subsystem includes a first set generation module and a second concept acquisition module. The first set generation module is used to generate a word set from the corpus, and the second concept acquisition module is used to obtain field concepts from the word set.
The word set is generated using the following steps:
Step 1: segment the corpus into words sentence by sentence, first removing stop words, to generate the word set $W$; then extract multi-word phrases from $W$ to obtain the candidate word set $WL$.
Step 2: if $WL$ is not empty, take a character string $w \in WL$; if $w$ satisfies the compound word decision condition, add $w$ to the compound word set as a compound word, $CW = CW \cup \{w\}$.
Step 3: output the word set $W = W \cup CW$.
The compound word decision condition is determined using the following steps:
Step 1: let the character string $w = s_1 s_2 \dots s_n$, where $s_1, s_2, \dots, s_n$ are the words obtained by segmenting $w$, and let $YW(s_1, s_2, \dots, s_n)$ denote the mutual information index of $s_1, s_2, \dots, s_n$ [formula not reproduced in the source text]. In this formula, $P(s_1)$ is the probability that word $s_1$ occurs, $P(s_2)$ is the probability that word $s_2$ occurs, ..., $P(s_n)$ is the probability that word $s_n$ occurs, and $P(s_1, s_2, \dots, s_n)$ is the probability that the words $s_1, s_2, \dots, s_n$ occur together in the corpus [formulas for these probabilities not reproduced in the source text]. Here $F(s_1)$ is the number of sentences containing word $s_1$, $F(s_2)$ is the number of sentences containing word $s_2$, ..., $F(s_n)$ is the number of sentences containing word $s_n$, $F(s_1, s_2, \dots, s_n)$ is the number of sentences containing the words $s_1, s_2, \dots, s_n$, and $F$ is the total number of sentences.
Step 2: if $YW(s_1, s_2, \dots, s_n) > YW_1$, where $YW_1$ is a given threshold, $w$ is determined to be a compound word.
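The word set generation and the compound word decision can be sketched as follows. Because the mutual information formula itself is not reproduced in the text, this sketch assumes the conventional pointwise mutual information, log2 of P(s1, ..., sn) divided by the product of the P(si), with each probability estimated as a sentence count divided by the total number of sentences F. That choice, the use of adjacent n-grams as the multi-word phrases, and all function names are assumptions for illustration only.

```python
import math
from typing import List, Sequence, Set


def sentence_count(sentences: Sequence[Sequence[str]], words: Sequence[str]) -> int:
    """F(...): number of sentences that contain every word in `words`."""
    return sum(1 for sent in sentences if all(w in sent for w in words))


def mutual_information(sentences: Sequence[Sequence[str]], parts: Sequence[str]) -> float:
    """Assumed YW: log2( P(s1..sn) / (P(s1) * ... * P(sn)) ), with P = F(.) / F."""
    total = len(sentences)
    joint = sentence_count(sentences, parts)
    if total == 0 or joint == 0:
        return float("-inf")  # the parts never co-occur, so never a compound
    p_joint = joint / total
    p_product = math.prod(sentence_count(sentences, [p]) / total for p in parts)
    return math.log2(p_joint / p_product)


def build_word_set(
    sentences: Sequence[Sequence[str]],
    stop_words: Set[str],
    yw1: float,
    n: int = 2,
) -> Set[str]:
    """Steps 1-3: remove stop words, collect n-gram candidates, keep those above YW1."""
    cleaned: List[List[str]] = [[w for w in s if w not in stop_words] for s in sentences]
    word_set = {w for sent in cleaned for w in sent}                      # W
    candidates = {
        tuple(sent[i:i + n]) for sent in cleaned for i in range(len(sent) - n + 1)
    }                                                                     # WL
    compounds = {
        " ".join(c) for c in candidates if mutual_information(cleaned, c) > yw1
    }                                                                     # CW
    return word_set | compounds                                           # W = W ∪ CW
```

The input is assumed to be already segmented into words; the segmentation step itself depends on the language of the corpus and is outside this sketch.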
Before obtaining field concepts, the urban information searching system of this preferred embodiment extracts the compound words in the corpus. This overcomes the defect that conventional field concept acquisition does not take compound words into account, and prevents field concepts that are compound words from being screened out during the subsequent selection of candidate concepts. In the concrete operation, a new compound word decision condition is proposed and an accurate compound word set is obtained, so that the urban information searching system can retrieve more comprehensive domain information.
Preferably, the field concepts are obtained using the following steps:
Step 1: select field concepts in advance from a domain knowledge base as the initial field concept set $DC$.
Step 2: for each word $s$ in the word set $W$, calculate its semantic similarity $CS(s, DC)$ using cosine similarity; if $CS(s, DC) > CS_1$, where $CS_1$ is a given threshold, add $s$ to the field concept set to obtain the once-updated field concept set, and remove $s$ from $W$ to obtain the updated word set.
Step 3: take words $s$ one by one from the updated word set; if $s$ satisfies the candidate concept decision condition, add $s$ to the candidate concept set $CC$.
Step 4: evaluate each candidate concept $s$ in $CC$ to obtain its evaluation value $ZC$, and add the $EM$ words with the largest evaluation values to the once-updated field concept set to obtain the final field concept set, where $EM \in [5, 9]$.
The candidate concept decision condition is determined using the following steps:
Step 1: calculate the number of sentences in the corpus that contain the word $s$; this number is equal to the sum of the sentence counts of the words that make up $s$:
$$F(s) = \sum_{i=1}^{n} F(s_i)$$
where $n$ is the number of words contained in $s$, $s_i$ is the $i$-th word contained in $s$, and $F(s)$ is the number of sentences in the corpus that contain the word $s$.
Step 2: calculate the number of sentences in the corpus in which the word $s$ occurs together with any field concept in the initial field concept set [formula not reproduced in the source text], where $dc$ denotes any field concept in the initial field concept set and $F(s, DC)$ denotes the number of sentences in which $s$ occurs together with any field concept in the initial field concept set.
Step 3: calculate the candidate value $FS$ of the word $s$ [formula not reproduced in the source text], where $F_{max}(s, DC)$ denotes the maximum number of sentences in which $s$ occurs together with a single field concept in the initial field concept set. If $FS > FS_1$, where $FS_1$ is a set threshold, the word $s$ is a candidate concept.
The evaluation value $ZC$ is determined using the following formula:
$$ZC = 1 + \log_2 F(s, DC) \times \frac{F(s, DC)}{F(s)} + \frac{F(s, DC)}{F(s)}$$
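The evaluation in Step 4 can be sketched as follows, using the formula ZC = 1 + log2 F(s, DC) × F(s, DC)/F(s) + F(s, DC)/F(s) given in claim 8 and reading the product before the sums. The function names and the input shape (a mapping from each candidate to its two sentence counts) are hypothetical, and the sketch assumes that every candidate has F(s, DC) > 0 and F(s) > 0, which the text does not state explicitly.

```python
import math
from typing import Dict, List, Tuple


def evaluation_value(f_s_dc: int, f_s: int) -> float:
    """ZC = 1 + log2 F(s,DC) * F(s,DC)/F(s) + F(s,DC)/F(s)."""
    if f_s_dc <= 0 or f_s <= 0:
        raise ValueError("sentence counts must be positive (assumption of this sketch)")
    ratio = f_s_dc / f_s
    return 1 + math.log2(f_s_dc) * ratio + ratio


def select_top_concepts(counts: Dict[str, Tuple[int, int]], em: int) -> List[str]:
    """Return the EM candidates with the largest ZC; EM is restricted to [5, 9]."""
    if not 5 <= em <= 9:
        raise ValueError("EM must lie in [5, 9]")
    ranked = sorted(counts, key=lambda s: evaluation_value(*counts[s]), reverse=True)
    return ranked[:em]
```

As a worked value, a candidate with F(s, DC) = 4 and F(s) = 8 gets ZC = 1 + 2 × 0.5 + 0.5 = 2.5.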
In obtaining field concepts, the urban information searching system of this preferred embodiment takes the semantic similarity of field concepts into account, which avoids missing semantically similar field concepts as a purely statistical method would. When determining field concepts, candidate concepts are determined first and field concepts afterwards, so the field concepts obtained better match the characteristics of the domain and the urban information searching system can retrieve more accurate domain information.
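The semantic similarity step (Step 2 of the field concept acquisition) can be sketched as follows. The text says only that cosine similarity is used, so several points here are assumptions: words and concepts are taken to be already represented as numeric vectors, CS(s, DC) is taken to be the largest cosine similarity between the word and any concept in the initial set, and all names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_to_concepts(word_vec: Vector, concept_vecs: List[Vector]) -> float:
    """Assumed CS(s, DC): best cosine match against the initial concept set."""
    return max((cosine(word_vec, c) for c in concept_vecs), default=0.0)


def first_update(
    words: Dict[str, Vector],
    concepts: Dict[str, Vector],
    cs1: float,
) -> Tuple[Set[str], Set[str]]:
    """Move every word with CS(s, DC) > CS1 from the word set into the concept set."""
    concept_vecs = list(concepts.values())  # compare against the initial DC only
    accepted = {
        w for w, vec in words.items()
        if similarity_to_concepts(vec, concept_vecs) > cs1
    }
    return set(concepts) | accepted, set(words) - accepted
```

The function returns the once-updated field concept set and the updated word set, which are the inputs to the candidate concept selection in Step 3.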
Information retrieval was carried out using the quick urban information searching system of the present invention. Information retrieval accuracy and information retrieval efficiency were recorded for different values of EM and compared with retrieval that does not use the present invention; the resulting beneficial effects are shown in the table:
EM    Improvement in information retrieval accuracy    Improvement in information retrieval efficiency
5     32%                                              31%
6     27%                                              24%
7     25%                                              20%
8     20%                                              16%
9     18%                                              15%
Finally, it should be noted that the above embodiments are merely intended to illustrate the technical solution of the present invention and not to limit its scope of protection. Although the invention has been explained in detail with reference to the preferred embodiments, a person of ordinary skill in the art should understand that the technical solution of the invention may be modified or equivalently substituted without departing from the essence and scope of the technical solution of the invention.

Claims (8)

1. A quick urban information searching system, characterized in that it comprises a mobile terminal and an urban data center, the mobile terminal being used to send a domain information request to the urban data center, and the urban data center being used to retrieve the corresponding domain information according to the request and feed it back to the mobile terminal.
2. The quick urban information searching system according to claim 1, characterized in that the mobile terminal includes a mobile phone and a tablet computer.
3. The quick urban information searching system according to claim 2, characterized in that the mobile terminal includes a GPS positioning module.
4. The quick urban information searching system according to claim 3, characterized in that the urban data center includes an information input subsystem, a field concept acquisition subsystem and an information retrieval subsystem, the information input subsystem being used to input the domain information request sent by the user, the field concept acquisition subsystem being used to obtain the corresponding field concepts from a corpus, and the information retrieval subsystem being used to carry out the corresponding information retrieval according to the field concepts.
5. The quick urban information searching system according to claim 4, characterized in that the information input subsystem includes a voice input module and a text input module, the voice input module being used to recognize input voice information and the text input module being used to recognize input text information; the voice input module includes a voice information collection unit, a voice information storage unit, a voice information transmission unit, an audio-to-text conversion unit and a text recognition unit, the voice information collection unit being used to collect voice information, the voice information storage unit being used to store the collected voice information, the voice information transmission unit being used to transmit the stored voice information to the audio-to-text conversion unit, the audio-to-text conversion unit being used to convert the voice information into text information, and the text recognition unit being used to recognize the text information; the text input module includes a text information input unit, a text information storage unit, a text information reading unit, a communication unit and a text information recognition unit, the text information input unit being used to enter text information manually, the text information storage unit being used to store the entered text information, the text information reading unit being used to read the stored text information, the communication unit being used to transmit the text information that has been read to the text information recognition unit, and the text information recognition unit being used to recognize the received text information.
6. The quick urban information searching system according to claim 5, characterized in that the field concept acquisition subsystem includes a first set generation module and a second concept acquisition module, the first set generation module being used to generate a word set from the corpus, and the second concept acquisition module being used to obtain field concepts from the word set; the word set is generated using the following steps: Step 1: segment the corpus into words sentence by sentence, first removing stop words, to generate the word set $W$, then extract multi-word phrases from $W$ to obtain the candidate word set $WL$; Step 2: if $WL$ is not empty, take a character string $w \in WL$, and if $w$ satisfies the compound word decision condition, add $w$ to the compound word set as a compound word, $CW = CW \cup \{w\}$; Step 3: output the word set $W = W \cup CW$; the compound word decision condition is determined using the following steps: Step 1: let the character string $w = s_1 s_2 \dots s_n$, where $s_1, s_2, \dots, s_n$ are the words obtained by segmenting $w$, and let $YW(s_1, s_2, \dots, s_n)$ denote the mutual information index of $s_1, s_2, \dots, s_n$ [formula not reproduced in the source text], in which $P(s_1)$ is the probability that word $s_1$ occurs, $P(s_2)$ is the probability that word $s_2$ occurs, ..., $P(s_n)$ is the probability that word $s_n$ occurs, and $P(s_1, s_2, \dots, s_n)$ is the probability that the words $s_1, s_2, \dots, s_n$ occur together in the corpus [formulas for these probabilities not reproduced in the source text], where $F(s_1)$ is the number of sentences containing word $s_1$, $F(s_2)$ is the number of sentences containing word $s_2$, ..., $F(s_n)$ is the number of sentences containing word $s_n$, $F(s_1, s_2, \dots, s_n)$ is the number of sentences containing the words $s_1, s_2, \dots, s_n$, and $F$ is the total number of sentences; Step 2: if $YW(s_1, s_2, \dots, s_n) > YW_1$, where $YW_1$ is a given threshold, $w$ is determined to be a compound word.
7. The quick urban information searching system according to claim 6, characterized in that the field concepts are obtained using the following steps: Step 1: select field concepts in advance from a domain knowledge base as the initial field concept set $DC$; Step 2: for each word $s$ in the word set $W$, calculate its semantic similarity $CS(s, DC)$ using cosine similarity, and if $CS(s, DC) > CS_1$, where $CS_1$ is a given threshold, add $s$ to the field concept set to obtain the once-updated field concept set, and remove $s$ from $W$ to obtain the updated word set; Step 3: take words $s$ one by one from the updated word set, and if $s$ satisfies the candidate concept decision condition, add $s$ to the candidate concept set $CC$; Step 4: evaluate each candidate concept $s$ in $CC$ to obtain its evaluation value $ZC$, and add the $EM$ words with the largest evaluation values to the once-updated field concept set to obtain the final field concept set, where $EM \in [5, 9]$.
8. The quick urban information searching system according to claim 7, characterized in that the candidate concept decision condition is determined using the following steps: Step 1: calculate the number of sentences in the corpus that contain the word $s$, this number being equal to the sum of the sentence counts of the words that make up $s$: $F(s) = \sum_{i=1}^{n} F(s_i)$, where $n$ is the number of words contained in $s$, $s_i$ is the $i$-th word contained in $s$, and $F(s)$ is the number of sentences in the corpus that contain the word $s$; Step 2: calculate the number of sentences in the corpus in which the word $s$ occurs together with any field concept in the initial field concept set [formula not reproduced in the source text], where $dc$ denotes any field concept in the initial field concept set and $F(s, DC)$ denotes the number of sentences in which $s$ occurs together with any field concept in the initial field concept set; Step 3: calculate the candidate value $FS$ of the word $s$ [formula not reproduced in the source text], where $F_{max}(s, DC)$ denotes the maximum number of sentences in which $s$ occurs together with a single field concept in the initial field concept set; if $FS > FS_1$, where $FS_1$ is a set threshold, the word $s$ is a candidate concept; the evaluation value $ZC$ is determined using the following formula:
$$ZC = 1 + \log_2 F(s, DC) \times \frac{F(s, DC)}{F(s)} + \frac{F(s, DC)}{F(s)}.$$
CN201710380733.6A 2017-05-25 2017-05-25 A kind of quick urban information searching system Pending CN107169118A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710380733.6A CN107169118A (en) 2017-05-25 2017-05-25 A kind of quick urban information searching system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710380733.6A CN107169118A (en) 2017-05-25 2017-05-25 A kind of quick urban information searching system

Publications (1)

Publication Number Publication Date
CN107169118A true CN107169118A (en) 2017-09-15

Family

ID=59820704

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710380733.6A Pending CN107169118A (en) 2017-05-25 2017-05-25 A kind of quick urban information searching system

Country Status (1)

Country Link
CN (1) CN107169118A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090210411A1 (en) * 2008-02-15 2009-08-20 Oki Electric Industry Co., Ltd. Information Retrieving System
CN106202514A (en) * 2016-07-21 2016-12-07 Beijing University of Posts and Telecommunications Agent-based method and system for retrieving cross-media information on emergencies

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
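Yang Yuehua: "Research on an intelligent information retrieval system for emergencies based on a domain knowledge model", China Doctoral Dissertations Full-text Database (Electronic Journal) *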

Similar Documents

Publication Publication Date Title
KR102196400B1 (en) Determining hotword suitability
CN104078044B (en) The method and apparatus of mobile terminal and recording search thereof
CN110019732B (en) Intelligent question answering method and related device
CN100545847C (en) A kind of method and system that blog articles is sorted
CN109508458B (en) Legal entity identification method and device
CN108536807B (en) Information processing method and device
CN101566998A (en) Chinese question-answering system based on neural network
CN103794211B (en) A kind of audio recognition method and system
CN102246169A (en) Assigning an indexing weight to a search term
CN112560450A (en) Text error correction method and device
CN103871402A (en) Language model training system, a voice identification system and corresponding method
CN105791446A (en) Method, device and system for processing private lending
CN110705292A (en) Entity name extraction method based on knowledge base and deep learning
CN116150651A (en) AI-based depth synthesis detection method and system
US20230054726A1 (en) Query-focused extractive text summarization of textual data
CN107169118A (en) A kind of quick urban information searching system
TW201407390A (en) Data clustering apparatus and method
CN103247316A (en) Method and system for constructing index in voice frequency retrieval
CN115331675A (en) Method and device for processing user voice
CN111177585A (en) Map POI feedback method and device
CN112115237B (en) Construction method and device of tobacco science and technology literature data recommendation model
CN111368028B (en) Method and device for recommending question respondents
CN107193802A (en) A kind of smart field concept auto acquisition system
CN108376365B (en) Bank number determining method and device
CN107205029A (en) A kind of efficient electronic burst event management system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170915