CN108460020A - Method and device for obtaining information - Google Patents

Method and device for obtaining information

Info

Publication number
CN108460020A
Authority
CN
China
Prior art keywords
information
file
key phrase
obtains
feature words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810178539.4A
Other languages
Chinese (zh)
Inventor
孙飞
刘明浩
邓射卫
韩超
朱翰闻
张发恩
郭江亮
唐进
尹世明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810178539.4A
Publication of CN108460020A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application disclose a method and device for obtaining information. One specific implementation of the method includes: extracting at least one key phrase from received pending input information, where a key phrase includes a number and the quantifier (measure word) corresponding to that number; and importing the at least one key phrase into a pre-trained information query model to obtain the target information corresponding to the at least one key phrase, where the information query model characterizes the correspondence between key phrases and target information. This implementation improves the accuracy of obtaining the target information that corresponds to a key phrase.

Description

Method and device for obtaining information
Technical field
The embodiments of the present application relate to the technical field of data processing, in particular to the field of computer technology, and more particularly to a method and device for obtaining information.
Background
With the development of information technology, massive amounts of data are transmitted between users' terminal devices in a variety of ways, which greatly improves the efficiency with which users obtain information. Before obtaining information, a user usually first has to search with keywords related to the information needed in order to get search results, and then select the needed information from those results.
Summary of the invention
The purpose of the embodiments of the present application is to propose a method and device for obtaining information.
In a first aspect, the embodiments of the present application provide a method for obtaining information. The method includes: extracting at least one key phrase from received pending input information, where a key phrase includes a number and the quantifier corresponding to that number; and importing the at least one key phrase into a pre-trained information query model to obtain the target information corresponding to the at least one key phrase, where the information query model characterizes the correspondence between key phrases and target information.
In some embodiments, the method includes a step of building the information query model, which includes: for each file among the history files, extracting the numbers in the file and obtaining the feature information corresponding to each number, where the feature information includes a quantifier; and, using a machine learning method, taking the number and the feature information as input and the file content corresponding to the number and the feature information as output, training to obtain the information query model.
In some embodiments, the feature information further includes a feature word, which describes the meaning of the corresponding number.
In some embodiments, obtaining the feature information corresponding to the number includes: performing semantic recognition on the file content of the file to obtain the feature word corresponding to the number.
In some embodiments, taking the number and the feature information as input and the file content corresponding to the number and the feature information as output, and training to obtain the information query model, includes: building a number label based on the number, the quantifier corresponding to the number, and the feature word corresponding to the number, and establishing a correspondence between the number label and the file content of the file in which the number appears.
In some embodiments, the step of building the information query model further includes: performing synonym expansion on the feature word to obtain a synonym set corresponding to the feature word.
In a second aspect, the embodiments of the present application provide a device for obtaining information. The device includes: a key phrase extraction unit, for extracting at least one key phrase from received pending input information, where a key phrase includes a number and the quantifier corresponding to that number; and an information query unit, for importing the at least one key phrase into a pre-trained information query model to obtain the target information corresponding to the at least one key phrase, where the information query model characterizes the correspondence between key phrases and target information.
In some embodiments, the device includes an information query model building unit, for building the information query model. The information query model building unit includes: an information acquisition subunit, for extracting, for each file among the history files, the numbers in the file and obtaining the feature information corresponding to each number, where the feature information includes a quantifier; and an information query model building subunit, for using a machine learning method to take the number and the feature information as input and the file content corresponding to the number and the feature information as output, and training to obtain the information query model.
In some embodiments, the feature information further includes a feature word, which describes the meaning of the corresponding number.
In some embodiments, the information acquisition subunit is configured to perform semantic recognition on the file content of the file to obtain the feature word corresponding to the number.
In some embodiments, the information query model building subunit is configured to build a number label based on the number, the quantifier corresponding to the number, and the feature word corresponding to the number, and to establish a correspondence between the number label and the file content of the file in which the number appears.
In some embodiments, the information query model building unit is further configured to perform synonym expansion on the feature word to obtain a synonym set corresponding to the feature word.
In a third aspect, the embodiments of the present application provide a terminal device, including: one or more processors; and a memory for storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method for obtaining information of the first aspect.
In a fourth aspect, the embodiments of the present application provide a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the method for obtaining information of the first aspect.
The method and device for obtaining information provided by the embodiments of the present application first extract from the pending input information at least one key phrase that includes a number and a quantifier, and then import the key phrase into the information query model to obtain the target information, which improves the accuracy of obtaining the target information corresponding to the key phrase.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
Fig. 1 is a diagram of an exemplary system architecture to which the present application can be applied;
Fig. 2 is a flow chart of one embodiment of the method for obtaining information according to the present application;
Fig. 3 is a schematic diagram of an application scenario of the method for obtaining information according to the present application;
Fig. 4 is a structural schematic diagram of one embodiment of the device for obtaining information according to the present application;
Fig. 5 is a structural schematic diagram of a computer system suitable for implementing the terminal device of the embodiments of the present application.
Detailed description
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the relevant invention.
It should be noted that, where there is no conflict, the embodiments of the present application and the features in those embodiments can be combined with one another. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which the method for obtaining information or the device for obtaining information of the embodiments of the present application can be applied.
As shown in Fig. 1, system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. Network 104 is the medium that provides communication links between terminal devices 101, 102, 103 and server 105. Network 104 may include various connection types, such as wired links, wireless communication links, or fiber optic cables.
A user can use terminal devices 101, 102, 103 to interact with server 105 through network 104 in order to receive or send messages and so on. Various communication client applications can be installed on terminal devices 101, 102, 103, such as web browser applications, search applications, and information query applications.
Terminal devices 101, 102, 103 can be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, laptop computers, and desktop computers.
Server 105 can be a server that provides various services, for example a server that performs the corresponding information search for the key phrases contained in the pending input information sent by terminal devices 101, 102, 103. The server can analyze and otherwise process the received pending input information and other data, and send the target information it obtains to terminal devices 101, 102, 103.
It should be noted that the method for obtaining information provided by the embodiments of the present application is generally executed by server 105; correspondingly, the device for obtaining information is generally located in server 105.
It should be noted that the server can be hardware or software. When the server is hardware, it can be implemented as a distributed server cluster made up of multiple servers, or as a single server. When the server is software, it can be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. No specific limitation is imposed here.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 1 are only illustrative. There can be any number of terminal devices, networks, and servers, as the implementation requires.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for obtaining information according to the present application is shown. The method for obtaining information includes the following steps:
Step 201: extract at least one key phrase from the received pending input information.
In this embodiment, the electronic device on which the method for obtaining information runs (for example, server 105 shown in Fig. 1) can receive pending input information, over a wired or wireless connection, from the terminal that the user uses to query information. It should be pointed out that the wireless connection can include but is not limited to 3G/4G, WiFi, Bluetooth, WiMAX, Zigbee, and UWB (ultra wideband) connections, as well as other wireless connection types now known or developed in the future.
When an existing information search method performs a search, the search information entered by the user can be words, numbers, or a combination of words and numbers. Such a method usually looks up in an information base the information closest to the search information and then sends the file content containing that information to the user's terminal device, so that the user can select the needed information. When the search information contains both numbers and words, existing methods usually give priority to information related to the words; when the search information contains only numbers, much of the information returned is invalid. For highly specialized files in which numbers carry considerable weight (such as various legal documents), existing information search methods often fail to find accurate information.
For this reason, the electronic device of the present application can first perform information recognition on the pending input information that the user sends through terminal devices 101, 102, 103, and extract at least one key phrase from it. A key phrase of the present application may include a number and the quantifier corresponding to that number. When the pending input information contains no explicit quantifier, the quantifier can default to a generic one such as "a" or "item". For example, when the pending input information is "relevant amount 5000", the corresponding key phrase can be "5000, yuan"; when the pending input information is "relevant amount 5000, time no more than 1 month", the two corresponding key phrases can be "5000, yuan" and "1, month". Depending on the actual pending input information, more key phrases can also be extracted, as the actual situation requires.
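The extraction in step 201 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes English-like input in which a unit, if present, directly follows the number, and the names `extract_key_phrases`, `KNOWN_QUANTIFIERS`, and `DEFAULT_QUANTIFIER` are hypothetical. Unlike the "relevant amount 5000" example above, the sketch does not infer a unit from context; it only falls back to a default quantifier.

```python
import re
from typing import List, Tuple

# Hypothetical default quantifier, used when a number has no recognised unit.
DEFAULT_QUANTIFIER = "item"

# Hypothetical, deliberately small set of recognised quantifiers.
KNOWN_QUANTIFIERS = {"yuan", "month", "day", "year"}

# A number, optionally followed by one alphabetic word that may be its unit.
_NUMBER_WITH_OPTIONAL_WORD = re.compile(r"(\d+(?:\.\d+)?)\s*([A-Za-z]+)?")


def extract_key_phrases(text: str) -> List[Tuple[str, str]]:
    """Return (number, quantifier) pairs found in the input text."""
    phrases = []
    for number, word in _NUMBER_WITH_OPTIONAL_WORD.findall(text):
        unit = word.lower()
        # Fall back to the default when no known quantifier follows the number.
        quantifier = unit if unit in KNOWN_QUANTIFIERS else DEFAULT_QUANTIFIER
        phrases.append((number, quantifier))
    return phrases
```

A production system would more likely rely on a tokenizer and part-of-speech tagging for the source language than on a regular expression.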
Step 202: import the at least one key phrase into a pre-trained information query model to obtain the target information corresponding to the at least one key phrase.
After the key phrases have been extracted from the pending input information, the electronic device can import them into the information query model, which performs an information query based on the key phrases and obtains the corresponding target information. The information query model characterizes the correspondence between key phrases and target information. As an example, the information query model can be a mapping table, built in advance by technicians from statistics over a large number of key phrases and target information, that stores the correspondences between multiple key phrases and target information. For example, when the key phrase is "5000, yuan", the information query model can retrieve the target information of all files related to "5000, yuan"; when the key phrases are "5000, yuan" and "1, month", it can retrieve the target information of all files related to both. Note that when the key phrases are "5000, yuan" and "1, month", the target information obtained can satisfy both key phrases at once, or only one of them, depending on actual needs.
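The mapping-table form of the information query model, including the choice between matching all key phrases and matching any of them, can be sketched as below. The table contents, the file identifiers, and the names `MAPPING_TABLE` and `query` are hypothetical and serve only as an illustration.

```python
from typing import Dict, Iterable, Set, Tuple

KeyPhrase = Tuple[str, str]  # (number, quantifier)

# Hypothetical mapping table from key phrases to identifiers of target files.
MAPPING_TABLE: Dict[KeyPhrase, Set[str]] = {
    ("5000", "yuan"): {"doc_a", "doc_b"},
    ("1", "month"): {"doc_b", "doc_c"},
}


def query(key_phrases: Iterable[KeyPhrase], require_all: bool = True) -> Set[str]:
    """Look up target files for the given key phrases.

    require_all=True keeps only files that match every key phrase;
    require_all=False keeps files that match at least one.
    """
    hits = [MAPPING_TABLE.get(phrase, set()) for phrase in key_phrases]
    if not hits:
        return set()
    if require_all:
        return set.intersection(*hits)
    return set.union(*hits)
```

Exposing the all-or-any choice as a parameter mirrors the text's remark that either behaviour may be wanted depending on actual needs.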
In some optional implementations of this embodiment, the method may include a step of building the information query model, which may include the following steps:
First, for each file among the history files, extract the numbers in the file and obtain the feature information corresponding to each number.
To be able to look up the target information corresponding to a key phrase through the information query model, the electronic device first needs to process each file among the history files. This embodiment can extract from each file the numbers that the file contains, and then determine the feature information corresponding to each number by methods such as semantic recognition. The feature information may include a quantifier.
Second, using a machine learning method, take the number and the feature information as input and the file content corresponding to the number and the feature information as output, and train to obtain the information query model.
After server 105 has obtained the numbers in a file and the feature information corresponding to each number, it can take the number and the feature information as input and the file content corresponding to them as output, and train to obtain the information query model. Specifically, server 105 can use a model such as a search engine or approximate nearest neighbors (ANN), take the number and the feature information as the model input and the corresponding file content as the model output, and train the model with a machine learning method to obtain the information query model. As a result, once a number and its feature information are fed into the information query model, the model can retrieve the file content that corresponds to them.
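The data flow of the two building steps can be illustrated with a plain inverted index. Note the substitution: the text names a search engine or approximate-nearest-neighbor model trained by machine learning, whereas the sketch below involves no training and only shows how (number, quantifier) pairs extracted from history files can be tied to the passages that contain them. The function name `build_index`, the sentence splitting, and the sample file are all assumptions.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

KeyPhrase = Tuple[str, str]   # (number, quantifier)
Passage = Tuple[str, str]     # (file name, sentence containing the number)


def build_index(history_files: Dict[str, str]) -> Dict[KeyPhrase, List[Passage]]:
    """Map each (number, quantifier) pair to the sentences in which it occurs."""
    index: Dict[KeyPhrase, List[Passage]] = defaultdict(list)
    for file_name, content in history_files.items():
        # Naive sentence split on whitespace that follows ., ! or ?
        for sentence in re.split(r"(?<=[.!?])\s+", content):
            # A number followed by a word is treated as number + quantifier.
            for number, word in re.findall(r"(\d+)\s+([A-Za-z]+)", sentence):
                index[(number, word.lower())].append((file_name, sentence))
    return dict(index)
```

For example, indexing a hypothetical file `case_1` whose content is "The defendant was fined 5000 yuan. The term was 1 month." yields one entry for ("5000", "yuan") and one for ("1", "month"), each pointing at its own sentence.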
In some optional implementations of this embodiment, the feature information may further include a feature word, which describes the meaning of the corresponding number.
Besides numbers and quantifiers, the pending input information entered by the user can also contain terms that qualify a number. For example, the pending input information can be "fine amount 5000", and the corresponding key phrase can be "5000, yuan, fine". Correspondingly, when the history files are processed, semantic recognition can also be performed on the file content of each file to obtain the feature word corresponding to each number. For example, suppose a legal document sets out a case; the terms "5000" and "yuan" appear in the account and are described as part of the case, but the term "fine" never appears in the case description. The electronic device can then perform semantic recognition on the case description containing "5000" and "yuan" and determine that the feature word corresponding to "5000" and "yuan" is "fine", rather than some other feature word such as "bonus" or "expense". In other words, the feature words obtained in this embodiment need not appear literally in the file; they can be derived by processing the file with methods such as semantic recognition. Thus, when the key phrase extracted from the user's pending input information is "5000, yuan, fine", the information query model can pick out, from among the files that contain only the terms "5000" and "yuan", those files and passages that do not literally contain the term "fine" but whose actual meaning is a fine. The present application therefore avoids, to a certain extent, the situation in which accurate target information cannot be found when the query is run directly on the user's pending input information, and greatly increases the probability that the user obtains accurate target information.
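The idea that a feature word can be assigned to a passage in which it never literally appears can be illustrated with a crude cue-word heuristic. This is only a stand-in for the semantic recognition described above, which the text does not specify; the cue lists and the names `CUE_WORDS` and `infer_feature_word` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical cue phrases suggesting what a nearby number means.
CUE_WORDS: Dict[str, Tuple[str, ...]] = {
    "fine": ("punished", "penalised", "ordered to pay"),
    "bonus": ("rewarded", "awarded"),
}


def infer_feature_word(passage: str) -> Optional[str]:
    """Return the feature word whose cues occur most often, or None if no cue occurs."""
    lowered = passage.lower()
    scores = {
        feature_word: sum(lowered.count(cue) for cue in cues)
        for feature_word, cues in CUE_WORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

A passage such as "The company was punished and ordered to pay 5000 yuan." is assigned the feature word "fine" even though the word "fine" does not occur in it. A real system would presumably use a trained semantic model rather than fixed cue lists.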
In some optional implementations of this embodiment, taking the number and the feature information as input and the file content corresponding to the number and the feature information as output, and training to obtain the information query model, may include: building a number label based on the number, the quantifier corresponding to the number, and the feature word corresponding to the number, and establishing a correspondence between the number label and the file content of the file in which the number appears.
To improve the speed and efficiency with which the information query model finds the target information corresponding to a key phrase, the electronic device of this embodiment can also build a number label from the number, the quantifier corresponding to the number, and the feature word corresponding to the number, and then establish a correspondence between the number label and the file content of the file in which the number appears. In this way each file is given at least one number label, and the file content corresponding to a number label can be located through that label. When the information query model receives a key phrase, it can find the corresponding file and file content directly through the number label, which improves the speed and efficiency of finding the target information.
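A number label and its correspondence to file content can be sketched with a hashable record used as a dictionary key, so that lookup by label is a single hash access. The names `NumberLabel` and `build_label_table`, and the sample records in the usage note, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class NumberLabel:
    """Label built from a number, its quantifier, and its feature word."""
    number: str
    quantifier: str
    feature_word: str = ""


# (file name, passage of file content that contains the number)
Passage = Tuple[str, str]


def build_label_table(
    records: Iterable[Tuple[NumberLabel, str, str]]
) -> Dict[NumberLabel, List[Passage]]:
    """Attach each number label to the file passages it was extracted from."""
    table: Dict[NumberLabel, List[Passage]] = {}
    for label, file_name, passage in records:
        table.setdefault(label, []).append((file_name, passage))
    return table
```

With a record such as `(NumberLabel("5000", "yuan", "fine"), "case_1", "ordered to pay 5000 yuan")`, the resulting table returns that passage directly for the key `NumberLabel("5000", "yuan", "fine")`, without scanning file contents.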
In some optional implementations of this embodiment, the step of building the information query model may further include: performing synonym expansion on the feature words to obtain a synonym set corresponding to each feature word.
When certain technical documents are queried, the pending input information entered by the user may not contain the specialized vocabulary used in those documents, so existing information query methods often cannot find the target information accurately. For this reason, this embodiment can also perform synonym expansion on the feature words to obtain a synonym set for each feature word. When the pending input information contains a synonym of a specialized term, accurate target information can still be found through the information query model of this embodiment. For example, when a key phrase extracted from the pending input information includes a feature word, feeding the key phrase into the information query model may already find corresponding target information; afterwards, the electronic device can also look up the synonym set of the feature word and use the synonyms to find further possible target information. This further improves the accuracy and efficiency of obtaining target information.
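Synonym expansion at query time can be sketched as follows, assuming a hand-written synonym set and a table keyed by (number, quantifier, feature word). The synonym entries and the names `SYNONYMS`, `expand_feature_word`, and `query_with_synonyms` are hypothetical; the text does not say how the synonym sets are produced.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical synonym sets, keyed by the feature word used in the number labels.
SYNONYMS: Dict[str, Set[str]] = {
    "fine": {"penalty", "forfeit"},
}

LabelKey = Tuple[str, str, str]  # (number, quantifier, feature word)


def expand_feature_word(word: str) -> Set[str]:
    """Return the word together with every synonym in its set."""
    word = word.lower()
    for canonical, synonyms in SYNONYMS.items():
        if word == canonical or word in synonyms:
            return {canonical} | synonyms
    return {word}


def query_with_synonyms(
    number: str, quantifier: str, feature_word: str, table: Dict[LabelKey, List[str]]
) -> List[str]:
    """Query the table once per synonym and concatenate the results."""
    results: List[str] = []
    for candidate in sorted(expand_feature_word(feature_word)):
        results.extend(table.get((number, quantifier, candidate), []))
    return results
```

A user who types "penalty" can thereby reach files labelled with "fine", which is the effect the paragraph above describes.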
Afterwards, the electronic device can send the target information it has obtained to the corresponding terminal device 101, 102, 103, which completes the process by which terminal devices 101, 102, 103 look up target information.
With continued reference to Fig. 3, Fig. 3 is a schematic diagram of an application scenario of the method for obtaining information according to this embodiment. In the application scenario of Fig. 3, terminal device 103 sends the pending input information "penalty amount 5000, time exceeding 1 month" to server 105 through network 104. Server 105 extracts two key phrases from the pending input information: "5000, yuan, fine" and "1, month". Then "5000, yuan, fine" and "1, month" are imported into the information query model, which obtains the corresponding target information from a library (for example, a legal document library).
The method provided by the above embodiment of the present application first extracts from the pending input information at least one key phrase that includes a number and a quantifier, and then imports the key phrase into the information query model to obtain the target information, which improves the accuracy of obtaining the target information corresponding to the key phrase.
With further reference to Fig. 4, as an implementation of the method shown in the figures above, the present application provides one embodiment of a device for obtaining information. The device embodiment corresponds to the method embodiment shown in Fig. 2, and the device can be applied in various electronic devices.
As shown in Fig. 4, the device 400 for obtaining information of this embodiment may include a key phrase extraction unit 401 and an information query unit 402. Key phrase extraction unit 401 is used to extract at least one key phrase from received pending input information, where a key phrase includes a number and the quantifier corresponding to that number. Information query unit 402 is used to import the at least one key phrase into a pre-trained information query model to obtain the target information corresponding to the at least one key phrase, where the information query model characterizes the correspondence between key phrases and target information.
In some optional implementations of this embodiment, the device 400 for obtaining information may include an information query model building unit (not shown) for building the information query model. The information query model building unit may include an information acquisition subunit (not shown) and an information query model building subunit (not shown). The information acquisition subunit is used to extract, for each file among the history files, the numbers in the file and to obtain the feature information corresponding to each number, where the feature information includes a quantifier. The information query model building subunit is used to take, by a machine learning method, the number and the feature information as input and the file content corresponding to the number and the feature information as output, and to train to obtain the information query model.
In some optional implementations of this embodiment, the feature information may further include a feature word, which describes the meaning of the corresponding number.
In some optional implementations of this embodiment, the information acquisition subunit may be configured to perform semantic recognition on the file content of the file to obtain the feature word corresponding to the number.
In some optional implementations of this embodiment, the information query model building subunit may be configured to build a number label based on the number, the quantifier corresponding to the number, and the feature word corresponding to the number, and to establish a correspondence between the number label and the file content of the file in which the number appears.
In some optional implementations of this embodiment, the information query model building unit is further configured to perform synonym expansion on the feature words to obtain a synonym set corresponding to each feature word.
This embodiment also provides a terminal device, including: one or more processors; and a memory for storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the above method for obtaining information.
This embodiment also provides a computer-readable medium on which a computer program is stored, where the program, when executed by a processor, implements the above method for obtaining information.
Referring now to Fig. 5, it shows a structural schematic diagram of a computer system 500 suitable for implementing the terminal device of the embodiments of the present application. The terminal device shown in Fig. 5 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present application.
As shown in Fig. 5, computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processing according to a program stored in read-only memory (ROM) 502 or a program loaded from storage section 508 into random access memory (RAM) 503. RAM 503 also stores the various programs and data required for the operation of system 500. CPU 501, ROM 502, and RAM 503 are connected to one another through bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
The following components are connected to I/O interface 505: an input section 506 including a keyboard, a mouse, and the like; an output section 507 including a cathode ray tube (CRT) or liquid crystal display (LCD) and the like, as well as a speaker and the like; a storage section 508 including a hard disk and the like; and a communication section 509 including a network interface card such as a LAN card or a modem. Communication section 509 performs communication processing via a network such as the Internet. A drive 510 is also connected to I/O interface 505 as needed. A removable medium 511, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on drive 510 as needed, so that a computer program read from it can be installed into storage section 508 as needed.
In particular, according to embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 509, and/or installed from the removable medium 511. When the computer program is executed by the central processing unit (CPU) 501, the above functions defined in the method of the present application are performed.
It should be noted that the computer-readable medium of the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in connection with an instruction execution system, apparatus or device. Also in the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transmit a program for use by or in connection with an instruction execution system, apparatus or device. Program code contained on a computer-readable medium may be transmitted using any suitable medium, including but not limited to: wireless, wire, optical cable, RF and the like, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the architecture, functions and operations of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, it may be described as: a processor comprising a keyword group extraction unit and an information query unit. The names of these units do not, under certain circumstances, limit the units themselves; for example, the information query unit may also be described as "a unit that looks up the target information corresponding to keyword groups through an information query model".
In another aspect, the present application further provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: extract at least one keyword group from received to-be-processed input information, wherein each keyword group comprises a number and a quantifier corresponding to the number; and import the at least one keyword group into a pre-trained information query model to obtain target information corresponding to the at least one keyword group, wherein the information query model is used to characterize the correspondence between keyword groups and target information.
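The two operations described above (extracting number-and-quantifier pairs, then querying a model with them) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the regular expression, the `KeywordGroup` type, the function names, the sample input and the dictionary standing in for the pre-trained information query model are all hypothetical.

```python
import re
from typing import NamedTuple


class KeywordGroup(NamedTuple):
    """A number together with the quantifier (unit word) that follows it."""
    number: str
    quantifier: str


# Assumed pattern: digits (optionally with a decimal part) followed by a run of letters.
_GROUP_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*([A-Za-z]+)")


def extract_keyword_groups(text: str) -> list[KeywordGroup]:
    """Extract every (number, quantifier) pair found in the input text."""
    return [KeywordGroup(n, q.lower()) for n, q in _GROUP_PATTERN.findall(text)]


def query_target_information(groups, model):
    """Return the target information for each keyword group known to the model."""
    return [model[g] for g in groups if g in model]


# Toy stand-in for the pre-trained model: keyword group -> target information.
toy_model = {
    KeywordGroup("500", "gb"): "Storage specification sheet",
    KeywordGroup("30", "days"): "Return policy document",
}

groups = extract_keyword_groups("Which plan has 500 GB and a 30 days trial?")
results = query_target_information(groups, toy_model)
```

A dictionary lookup is used here only to keep the example self-contained; in the application the correspondence is held by a trained model rather than an exact-match table.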
The above description is only of preferred embodiments of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features with similar functions disclosed in the present application.

Claims (14)

1. A method for obtaining information, characterized in that the method comprises:
extracting at least one keyword group from received to-be-processed input information, wherein each keyword group comprises a number and a quantifier corresponding to the number; and
importing the at least one keyword group into a pre-trained information query model to obtain target information corresponding to the at least one keyword group, wherein the information query model is used to characterize the correspondence between keyword groups and target information.
2. The method according to claim 1, characterized in that the method comprises a step of building the information query model, the step of building the information query model comprising:
for each file among historical files, extracting the numbers in the file and obtaining characteristic information corresponding to each number, the characteristic information comprising a quantifier; and
using a machine learning method, training with the numbers and characteristic information as input and the file content corresponding to the numbers and characteristic information as output, to obtain the information query model.
3. The method according to claim 2, characterized in that the characteristic information further comprises a feature word, the feature word being used to describe the meaning of the corresponding number.
4. The method according to claim 3, characterized in that obtaining the characteristic information corresponding to the number comprises:
performing semantic recognition on the file content of the file to obtain the feature word corresponding to the number.
5. The method according to claim 3, characterized in that training with the numbers and characteristic information as input and the file content corresponding to the numbers and characteristic information as output, to obtain the information query model, comprises:
building a number label based on the number, the quantifier corresponding to the number and the feature word corresponding to the number, and establishing a correspondence between the number label and the file content of the file in which the number is located.
6. The method according to claim 3, characterized in that the step of building the information query model further comprises:
performing synonym expansion on the feature word to obtain a synonym set corresponding to the feature word.
7. A device for obtaining information, characterized in that the device comprises:
a keyword group extraction unit, configured to extract at least one keyword group from received to-be-processed input information, wherein each keyword group comprises a number and a quantifier corresponding to the number; and
an information query unit, configured to import the at least one keyword group into a pre-trained information query model to obtain target information corresponding to the at least one keyword group, wherein the information query model is used to characterize the correspondence between keyword groups and target information.
8. The device according to claim 7, characterized in that the device comprises an information query model building unit, configured to build the information query model, the information query model building unit comprising:
an information acquisition subunit, configured to, for each file among historical files, extract the numbers in the file and obtain characteristic information corresponding to each number, the characteristic information comprising a quantifier; and
an information query model building subunit, configured to use a machine learning method to train with the numbers and characteristic information as input and the file content corresponding to the numbers and characteristic information as output, to obtain the information query model.
9. The device according to claim 8, characterized in that the characteristic information further comprises a feature word, the feature word being used to describe the meaning of the corresponding number.
10. The device according to claim 9, characterized in that the information acquisition subunit is configured to:
perform semantic recognition on the file content of the file to obtain the feature word corresponding to the number.
11. The device according to claim 9, characterized in that the information query model building subunit is configured to:
build a number label based on the number, the quantifier corresponding to the number and the feature word corresponding to the number, and establish a correspondence between the number label and the file content of the file in which the number is located.
12. The device according to claim 9, characterized in that the information query model building unit is further configured to:
perform synonym expansion on the feature word to obtain a synonym set corresponding to the feature word.
13. A terminal device, comprising:
one or more processors; and
a memory for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method according to any one of claims 1 to 6.
14. A computer-readable medium storing a computer program, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 6.
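The model-building steps recited in claims 2 to 6 can be sketched as follows. This is only a hedged illustration: a plain dictionary index stands in for the machine-learning training the claims refer to, the feature word is assumed to be the word immediately preceding the number rather than the result of semantic recognition, and the `SYNONYMS` table, the `build_index` function and the sample files are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical synonym table used for the synonym-expansion step.
SYNONYMS = {"price": {"cost"}, "weight": {"mass"}}

# Assumed pattern: feature word, then number, then quantifier, e.g. "price 20 yuan".
_PATTERN = re.compile(r"([A-Za-z]+)\s+(\d+(?:\.\d+)?)\s*([A-Za-z]+)")


def build_index(files: dict[str, str]) -> dict[tuple[str, str, str], set[str]]:
    """Map each number label (number, quantifier, feature word) to matching file content."""
    index = defaultdict(set)
    for content in files.values():
        for feature, number, quantifier in _PATTERN.findall(content):
            feature = feature.lower()
            # Register the label under the feature word and each of its synonyms.
            for word in {feature} | SYNONYMS.get(feature, set()):
                index[(number, quantifier.lower(), word)].add(content)
    return dict(index)


# Toy historical files, invented for the example.
history_files = {
    "a.txt": "The price 20 yuan applies to each unit.",
    "b.txt": "Shipping weight 3 kg per box.",
}
index = build_index(history_files)
```

With synonym expansion, a query phrased with "cost" reaches the same file content as one phrased with "price", which is the apparent purpose of the synonym set in claim 6.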
CN201810178539.4A 2018-03-05 2018-03-05 Method and device for obtaining information Pending CN108460020A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810178539.4A CN108460020A (en) 2018-03-05 2018-03-05 Method and device for obtaining information

Publications (1)

Publication Number Publication Date
CN108460020A true CN108460020A (en) 2018-08-28

Family

ID=63217231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810178539.4A Pending CN108460020A (en) 2018-03-05 2018-03-05 Method and device for obtaining information

Country Status (1)

Country Link
CN (1) CN108460020A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103593816A (en) * 2013-11-25 2014-02-19 方正国际软件有限公司 Medical history document memorizing device and memorizing method
CN107563455A (en) * 2017-10-18 2018-01-09 百度在线网络技术(北京)有限公司 For obtaining the method and device of information

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111382554A (en) * 2018-12-11 2020-07-07 顺丰科技有限公司 Floor information extraction method and system
CN111460274A (en) * 2019-01-18 2020-07-28 北京字节跳动网络技术有限公司 Information processing method and device
CN111460274B (en) * 2019-01-18 2023-04-28 北京字节跳动网络技术有限公司 Information processing method and device
CN115543925A (en) * 2022-12-02 2022-12-30 北京德风新征程科技有限公司 File processing method and device, electronic equipment and computer readable medium

Similar Documents

Publication Publication Date Title
CN108287927B (en) For obtaining the method and device of information
CN107491534A (en) Information processing method and device
CN107105031A (en) Information-pushing method and device
CN107491547A (en) Searching method and device based on artificial intelligence
CN108305626A (en) The sound control method and device of application program
CN108388674A (en) Method and apparatus for pushed information
CN108628830A (en) A kind of method and apparatus of semantics recognition
CN108229704A (en) For the method and apparatus of pushed information
CN108280200A (en) Method and apparatus for pushed information
CN108572990A (en) Information-pushing method and device
CN107590252A (en) Method and device for information exchange
CN107943895A (en) Information-pushing method and device
CN109635094A (en) Method and apparatus for generating answer
CN106919711A (en) The method and apparatus of the markup information based on artificial intelligence
CN107783962A (en) Method and device for query statement
CN107748879A (en) For obtaining the method and device of face information
CN110209677A (en) The method and apparatus of more new data
CN109145014A (en) The method and apparatus for generating elastic searching request
CN108121699A (en) For the method and apparatus of output information
CN108460020A (en) Method and device for obtaining information
US20220121668A1 (en) Method for recommending document, electronic device and storage medium
CN110119445A (en) The method and apparatus for generating feature vector and text classification being carried out based on feature vector
CN108959087A (en) test method and device
CN109933217A (en) Method and apparatus for pushing sentence
CN108021556A (en) For obtaining the method and device of information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180828