CN108256078A - Information acquisition method and device - Google Patents

Publication number
CN108256078A
Authority
CN
China
Prior art keywords
risk
content
pages
word
target entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810049805.3A
Other languages
Chinese (zh)
Other versions
CN108256078B (en)
Inventor
方照发
陈科
汪凯
张发恩
唐进
郭江亮
尹世明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201810049805.3A
Publication of CN108256078A
Application granted
Publication of CN108256078B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/953: Querying, e.g. by the use of web search engines
    • G06F16/9535: Search customisation based on user profiles and personalisation

Abstract

The embodiments of the present application disclose an information acquisition method and device. One specific embodiment of the method includes: obtaining the page content, within a predetermined time period, of webpages corresponding to a target entity; determining whether the page content includes a risk word from a preset risk vocabulary; and, in response to determining that the page content includes a risk word from the preset risk vocabulary, determining risk information corresponding to the target entity based on the risk words of the preset risk vocabulary included in the page content. The risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word. This embodiment can improve the accuracy of the risk information determined for the target entity.

Description

Information acquisition method and device
Technical field
The present application relates to the field of computer technology, specifically to the field of Internet technology, and more particularly to an information acquisition method and device.
Background
With its development, the Internet has gradually become one of the main media. In the Internet era, information about enterprises and information about individuals are both becoming increasingly transparent. Information about an enterprise (for example, its staff composition, rewards and penalties, and financial situation) can be obtained not only from content published on its official website, but also from comments about the enterprise that users publish on Internet portal websites, forums, news sites, blogs, microblogs, and the like.
Although information about an enterprise or an individual can be obtained through the Internet, the information obtained is usually one-sided.
Summary of the invention
The embodiments of the present application propose an information acquisition method and device.
In a first aspect, an embodiment of the present application provides an information acquisition method, the method including: obtaining the page content, within a predetermined time period, of webpages corresponding to a target entity; determining whether the page content includes a risk word from a preset risk vocabulary; and, in response to determining that the page content includes a risk word from the preset risk vocabulary, determining risk information corresponding to the target entity based on the risk words of the preset risk vocabulary included in the page content, where the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word.
In some embodiments, obtaining the page content, within a predetermined time period, of webpages corresponding to the target entity includes: crawling the page content of multiple preset webpages within the predetermined time period; determining the degree of association between the page content of each crawled preset webpage and the target entity; and taking the page content of the preset webpages whose degree of association with the target entity exceeds a predetermined threshold as the page content of the webpages corresponding to the target entity.
In some embodiments, obtaining the page content, within a predetermined time period, of webpages corresponding to the target entity includes: crawling, from a search engine, the page content of webpages associated with the target entity.
In some embodiments, the page content includes a title, and determining whether the page content includes a risk word from the preset risk vocabulary includes: determining whether the title includes a risk word from the preset risk vocabulary.
In some embodiments, the information acquisition method further includes: if it is determined that the page content does not include a risk word from the preset risk vocabulary, performing sentiment analysis on the title; if the sentiment orientation indicated by the sentiment analysis result is negative, performing semantic analysis on the page content; and determining the risk type corresponding to the page content according to the semantic analysis result.
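The fallback order in this embodiment (vocabulary match first, then title sentiment, then semantic analysis of the body) can be sketched as follows. This is only a minimal illustration: `classify_page`, `sentiment_of`, and `semantic_risk_type` are hypothetical names, and the two callables stand in for whatever sentiment and semantic analysis models an implementation would use, which the application does not specify.

```python
from typing import Callable, Dict, Optional


def classify_page(
    title: str,
    body: str,
    vocabulary: Dict[str, str],
    sentiment_of: Callable[[str], str],
    semantic_risk_type: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Return a risk type for one page, or None if no risk is detected.

    vocabulary maps lower-case risk words to risk types.
    sentiment_of and semantic_risk_type are assumed, pluggable analyzers.
    """
    text = (title + "\n" + body).lower()
    # 1. Vocabulary match: the first risk word found decides the risk type.
    for word, risk_type in vocabulary.items():
        if word in text:
            return risk_type
    # 2. No risk word found: run sentiment analysis on the title only.
    if sentiment_of(title) == "negative":
        # 3. Negative title: fall back to semantic analysis of the body.
        return semantic_risk_type(body)
    return None
```

Passing the analyzers in as parameters keeps the control flow testable with simple stand-ins.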
In some embodiments, the risk information further includes a risk coefficient, and the information acquisition method further includes: determining the risk coefficient of the target entity based on the number of times the name of the target entity appears in the page content and the positions at which it appears, where the risk coefficient is positively correlated with the number of times the name of the target entity appears in the page content, and negatively correlated with the front-to-back order of the positions at which the name appears (the earlier the name appears, the larger the coefficient).
In some embodiments, determining whether the page content includes a risk word from the preset risk vocabulary includes: extracting keywords from the page content; and determining whether each risk word in the preset risk vocabulary matches the keywords.
In some embodiments, the risk information further includes a risk coefficient, and determining the risk information corresponding to the target entity, in response to determining that the page content includes a risk word from the preset risk vocabulary, based on the risk words of the preset risk vocabulary included in the page content includes: in response to at least one risk word in the preset risk vocabulary matching a keyword, determining the risk type and risk coefficient corresponding to the target entity based on the risk words that match the keywords.
In some embodiments, the risk information further includes a risk level, and the preset risk vocabulary stores risk words in association with the risk level of each risk word; and determining the risk information corresponding to the target entity, in response to determining that the page content includes a risk word from the preset risk vocabulary, based on the risk words of the preset risk vocabulary included in the page content includes: determining the risk type and risk level corresponding to the target entity based on the risk words of the preset risk vocabulary included in the page content.
In a second aspect, an embodiment of the present application provides an information acquisition device, the device including: an acquiring unit configured to obtain the page content, within a predetermined time period, of webpages corresponding to a target entity; a first determining unit configured to determine whether the page content includes a risk word from a preset risk vocabulary; and a second determining unit configured to, in response to determining that the page content includes a risk word from the preset risk vocabulary, determine risk information corresponding to the target entity based on the risk words of the preset risk vocabulary included in the page content, where the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word.
In some embodiments, the acquiring unit is further configured to: crawl the page content of multiple preset webpages within the predetermined time period; determine the degree of association between the page content of each crawled preset webpage and the target entity; and take the page content of the preset webpages whose degree of association with the target entity exceeds a predetermined threshold as the page content of the webpages corresponding to the target entity.
In some embodiments, the acquiring unit is further configured to: crawl, from a search engine, the page content of webpages associated with the target entity.
In some embodiments, the page content includes a title, and the first determining unit is further configured to: determine whether the title includes a risk word from the preset risk vocabulary.
In some embodiments, the information acquisition device further includes a third determining unit configured to: if it is determined that the page content does not include a risk word from the preset risk vocabulary, perform sentiment analysis on the title; if the sentiment orientation indicated by the sentiment analysis result is negative, perform semantic analysis on the page content; and determine the risk type corresponding to the page content according to the semantic analysis result.
In some embodiments, the risk information further includes a risk coefficient, and the information acquisition device further includes a fourth determining unit configured to: determine the risk coefficient of the target entity based on the number of times the name of the target entity appears in the page content and the positions at which it appears, where the risk coefficient is positively correlated with the number of times the name of the target entity appears in the page content, and negatively correlated with the front-to-back order of the positions at which the name appears.
In some embodiments, the first determining unit is further configured to: extract keywords from the page content; and determine whether each risk word in the preset risk vocabulary matches the keywords.
In some embodiments, the risk information further includes a risk coefficient, and the second determining unit is further configured to: in response to at least one risk word in the preset risk vocabulary matching a keyword, determine the risk type and risk coefficient corresponding to the target entity based on the risk words that match the keywords.
In some embodiments, the risk information further includes a risk level, and the preset risk vocabulary stores risk words in association with the risk level of each risk word; and the second determining unit is further configured to: determine the risk type and risk level corresponding to the target entity based on the risk words of the preset risk vocabulary included in the page content.
In a third aspect, an embodiment of the present application provides a server, the server including: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the method described in any implementation of the first aspect.
The information acquisition method and device provided by the embodiments of the present application obtain the page content, within a predetermined time period, of webpages corresponding to a target entity, then determine whether the page content includes a risk word from a preset risk vocabulary, and finally, in response to determining that it does, determine risk information corresponding to the target entity based on the risk words of the preset risk vocabulary included in the page content. This can improve the accuracy of the risk information determined for the target entity. When the method or device disclosed in the embodiments of the present application is applied to pushing the acquired information, the pushed information can be better targeted.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent upon reading the detailed description of non-limiting embodiments made with reference to the following drawings:
Fig. 1 is a diagram of an exemplary system architecture to which the present application can be applied;
Fig. 2 is a flowchart of one embodiment of the information acquisition method according to the present application;
Fig. 3 is a flowchart of another embodiment of the information acquisition method according to the present application;
Fig. 4 is a flowchart of yet another embodiment of the information acquisition method according to the present application;
Fig. 5 is a structural diagram of one embodiment of the information acquisition device according to the present application;
Fig. 6 is a structural diagram of a computer system suitable for implementing the server of the embodiments of the present application.
Detailed description
The present application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the relevant invention, not to limit it. It should also be noted that, for ease of description, only the parts related to the relevant invention are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a first server 104, a second server 105, and networks 106, 107, 108. The network 106 serves as the medium providing communication links between the terminal devices 101, 102, 103 and the first server 104. The network 107 serves as the medium providing communication links between the first server 104 and the second server 105. The network 108 serves as the medium providing communication links between the second server 105 and the terminal devices 101, 102, 103. The networks 106, 107, and 108 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the first server 104 through the network 106, to receive or send messages and the like. A user may also use the terminal devices 101, 102, 103 to interact with the second server 105 through the network 108, to receive or send messages and the like. Various client applications may be installed on the terminal devices 101, 102, 103, such as web browser applications, shopping applications, search applications, instant messaging tools, email clients, and social platform software.
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, and desktop computers.
The first server 104 may interact with the second server 105 through the network 107, to receive or send messages and the like.
The second server 105 may be a server providing various services, for example a background server that receives and stores the search request data of a large number of users from the terminal devices 101, 102, 103 through the network 108 and provides corresponding webpage content to the terminal devices 101, 102, 103 according to the search request data. The first server 104 may be a server providing various services, for example a background server that obtains the search request data from the second server 105 through the network 107, processes the search request data, and pushes the processing result to the terminal devices 101, 102, 103.
It should be noted that the information push method provided by the embodiments of the present application is generally executed by the first server 104; correspondingly, the information push device is generally disposed in the first server 104.
It is worth noting that, in some application scenarios, the first server and the second server may be the same physical server.
It should be understood that the numbers of terminal devices, networks, first servers, and second servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, first servers, and second servers, as required by the implementation.
With continued reference to Fig. 2, a flow 200 of one embodiment of the information acquisition method according to the present application is shown. The information acquisition method includes the following steps:
Step 201: obtain the page content, within a predetermined time period, of webpages corresponding to the target entity.
In this embodiment, the electronic device on which the information acquisition method runs (for example, the first server 104 shown in Fig. 1) may obtain, through a wired or wireless connection, the page content of webpages corresponding to the target entity within a predetermined time period from a web content server (for example, the second server 105 shown in Fig. 1). The predetermined time period may be a period that includes the current time; for example, it may run from 24 hours before the current time up to the current time. The predetermined time period may also be any specified period. The target entity may be, for example, a pre-specified economic entity (such as an enterprise).
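As a rough illustration of the 24-hour example above, the sketch below keeps only pages published within a window ending at the current time. The page representation (a dict with a `published_at` timestamp) and the function names are assumptions made for this example; the application does not prescribe a data format.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def in_window(published_at: datetime, now: datetime, hours: int = 24) -> bool:
    """True if published_at falls within the last `hours` hours up to `now`."""
    return now - timedelta(hours=hours) <= published_at <= now


def pages_in_window(pages: List[Dict], now: datetime, hours: int = 24) -> List[Dict]:
    """Keep only pages whose (assumed) 'published_at' field is inside the window."""
    return [p for p in pages if in_window(p["published_at"], now, hours)]


if __name__ == "__main__":
    now = datetime(2018, 1, 18, 12, 0)
    pages = [
        {"url": "a", "published_at": datetime(2018, 1, 18, 0, 0)},   # inside
        {"url": "b", "published_at": datetime(2018, 1, 16, 0, 0)},   # too old
    ]
    print([p["url"] for p in pages_in_window(pages, now)])
```

An arbitrary period, which the text also allows, would simply replace the `now - timedelta(...)` bound with an explicit start time.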
Here, a webpage corresponding to the target entity may be, for example, a webpage on the target entity's own website, in which case the page content of the webpage corresponding to the target entity is the page content of a webpage on that website. The page content of a webpage corresponding to the target entity may also be the page content of a webpage crawled from the Internet by a web crawler according to the name of the target entity. The name of the target entity may be its full name or its abbreviation. The page content crawled by the web crawler may be the page content of a webpage, published by the administration for industry and commerce, that includes the name of the target entity; of a webpage, published by a stock market portal website, that includes the name of the target entity; or of a webpage, published by a news portal website, that includes the name of the target entity. The page content may also be that of a webpage, including the name of the target entity, on a forum, blog, microblog, mhkc group, video site, social network, electronic newspaper, or other channel that publishes public opinion information.
Step 202: determine whether the page content includes a risk word from the preset risk vocabulary.
The preset risk vocabulary may be set in advance on the electronic device on which the information acquisition method runs. Alternatively, the electronic device may access, through a network, a preset risk vocabulary set on another physical device. In this embodiment, after the page content, within the predetermined time period, of the webpages corresponding to the target entity is obtained in step 201, the electronic device (for example, the first server 104 shown in Fig. 1) may analyze the page content using various analysis means to determine whether the page content of the webpages includes a risk word from the preset risk vocabulary.
In this embodiment, the risk types may include, but are not limited to: market risk, product risk, operating risk, investment risk, exchange risk, personnel risk, system risk, merger and acquisition risk, natural risk, quality risk, policy risk, legal risk, and diplomatic risk.
Each risk type may correspond to multiple risk words. A risk word may be a single word or a phrase. For example, the risk words corresponding to operating risk may include, but are not limited to: insolvency, losses, shutdown, mergers and acquisitions, backdoor restructuring, production halt, equity pledge, unable to make ends meet, layoffs, bankruptcy reorganization, and total loss of capital. The risk words corresponding to personnel risk may include, but are not limited to: brain drain, improper appointments, and collective resignation. The risk words corresponding to merger and acquisition risk include: malicious acquisition and malicious merger. The risk words corresponding to legal risk may include, but are not limited to: environmental pollution, tax evasion, tax dodging, malicious dumping, and employing child labor. The preset risk vocabulary may store risk words in association with the risk type of each risk word.
In this example, the electronic device may match any risk word in the preset risk vocabulary against the page content; if the risk word successfully matches any word in the page content, it is determined that the page content includes a risk word from the preset risk vocabulary.
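A minimal sketch of such a vocabulary and the matching step is given below. The entries are a small illustrative subset, the names `RISK_VOCABULARY`, `find_risk_words`, and `contains_risk_word` are hypothetical, and plain case-insensitive substring matching is a simplification chosen for this example; matching against segmented words, as would be needed for Chinese text, is not shown.

```python
from typing import Dict, List

# Illustrative subset only: risk word -> risk type (stored in association).
RISK_VOCABULARY: Dict[str, str] = {
    "insolvency": "operating risk",
    "production halt": "operating risk",
    "brain drain": "personnel risk",
    "malicious acquisition": "merger and acquisition risk",
    "tax evasion": "legal risk",
}


def find_risk_words(page_content: str, vocabulary: Dict[str, str] = RISK_VOCABULARY) -> List[str]:
    """Return the risk words from the vocabulary that occur in the page content."""
    text = page_content.lower()
    # Simplification: substring match instead of word-level matching.
    return [word for word in vocabulary if word in text]


def contains_risk_word(page_content: str, vocabulary: Dict[str, str] = RISK_VOCABULARY) -> bool:
    """Step 202 in miniature: does the page include any risk word?"""
    return bool(find_risk_words(page_content, vocabulary))
```

Storing the vocabulary as a mapping keeps the word-to-type association needed by the later step in the same structure.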
Step 203: in response to determining that the page content includes a risk word from the preset risk vocabulary, determine risk information corresponding to the target entity based on the risk words of the preset risk vocabulary included in the page content.
In this embodiment, in response to determining that the page content of the webpages includes a risk word from the preset risk vocabulary, the electronic device may determine the risk information corresponding to the target entity according to the risk types corresponding to the risk words of the preset risk vocabulary included in the page content. The risk information corresponding to the target entity may include those risk types. That is, the electronic device determines the risk types corresponding to the risk words of the preset risk vocabulary included in the page content as the risk types corresponding to the target entity.
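Step 203 can be sketched as a lookup from matched risk words to their stored risk types. The function name `risk_info_for` and the shape of the returned record are assumptions for illustration; the application only requires that the risk information include the risk types.

```python
from typing import Dict, List, Optional


def risk_info_for(entity: str, page_content: str, vocabulary: Dict[str, str]) -> Optional[Dict]:
    """Build a risk record for the entity from the risk words found in the page.

    vocabulary maps lower-case risk words to risk types. Returns None when the
    page contains no risk word, so the caller can apply other handling.
    """
    text = page_content.lower()
    # Collect the distinct risk types of every matched risk word.
    risk_types: List[str] = sorted({rt for word, rt in vocabulary.items() if word in text})
    if not risk_types:
        return None
    return {"entity": entity, "risk_types": risk_types}
```

Several risk words of the same type collapse to one entry, while a page that mentions words of different types yields several risk types for the same entity.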
In some optional implementations of this embodiment, after the risk information corresponding to the target entity is determined in step 203, the electronic device may push the risk information to a preset terminal device. The preset terminal device may be a terminal device corresponding to the target entity. When the target entity is an economic entity, the terminal device corresponding to the target entity may be, for example, a terminal device used by a decision-maker of the economic entity. The preset terminal device may also be a terminal device used by staff of a financial institution related to the target entity.
The method provided by the above embodiment of the present application determines the risk information corresponding to the target entity by analyzing page content related to the target entity, thereby improving the accuracy of the risk information determined for the target entity.
When the method or device disclosed in the embodiments of the present application is applied to pushing the acquired information, the pushed information can be better targeted.
In some optional implementations of this embodiment, obtaining the page content, within a predetermined time period, of webpages corresponding to the target entity in step 201 may further include the following sub-steps:
Step 2011: crawl the page content of multiple preset webpages within the predetermined time period.
In these optional implementations, the electronic device may use incremental crawling, obtaining the page content of the multiple preset webpages once every predetermined time period. The page contents obtained in successive crawls do not overlap in publication time, which ensures that the crawled page content of the preset webpages is updated in a timely manner. A preset webpage may be the webpage corresponding to a pre-specified uniform resource locator (URL), or a webpage of a pre-specified financial website, litigation website, administrative penalty announcement website, or the like. These preset webpages can reflect information about the target entity's finances, violations of law, administrative penalties, and so on. Because the information published on these webpages may have been strictly reviewed, the credibility of their page content is relatively high, which helps ensure the accuracy of the risk information determined from it for the target entity.
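One way to keep successive crawls from overlapping in publication time is to split the timeline into consecutive half-open windows, as in the sketch below. The window representation and the `published_at` field are assumptions for this example, and the actual fetching of pages is left out.

```python
from datetime import datetime, timedelta
from typing import Dict, Iterator, List, Tuple


def crawl_windows(start: datetime, end: datetime, period: timedelta) -> Iterator[Tuple[datetime, datetime]]:
    """Yield consecutive half-open windows [lo, hi) covering [start, end)."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    lo = start
    while lo < end:
        hi = min(lo + period, end)
        yield lo, hi
        lo = hi  # the next window starts exactly where this one ended


def pages_for_window(pages: List[Dict], lo: datetime, hi: datetime) -> List[Dict]:
    """Pages published in [lo, hi); a page falls into exactly one window."""
    return [p for p in pages if lo <= p["published_at"] < hi]
```

Because each window is closed on the left and open on the right, a page published exactly on a boundary is picked up by one crawl only.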
Step 2012: determine the degree of association between the page content of each crawled preset webpage and the target entity.
In these optional implementations, the electronic device may determine the name of the target entity, match the name against the page content of each preset webpage, and determine the degree of association between the page content and the target entity according to the number of times the name appears in the page. The degree of association is positively correlated with the number of times the name of the target entity appears in the page content. The name of the target entity may be its full name or its abbreviation. The abbreviation of the target entity may be obtained by summarizing a large amount of Internet data.
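A minimal sketch of a count-based degree of association follows. The text only requires positive correlation with the occurrence count; the particular mapping `count / (count + 1)` and the function names are assumptions chosen to give a value between 0 and 1.

```python
from typing import Iterable


def name_occurrences(page_content: str, names: Iterable[str]) -> int:
    """Count occurrences of the entity's full name and abbreviations.

    Longer names are counted and blanked out first, so an abbreviation that
    is part of the full name is not counted twice for the same mention.
    """
    text = page_content
    total = 0
    for name in sorted(set(names), key=len, reverse=True):
        if not name:
            continue
        total += text.count(name)
        text = text.replace(name, " ")
    return total


def association_degree(page_content: str, names: Iterable[str]) -> float:
    """Assumed mapping: increases with the count and stays below 1."""
    count = name_occurrences(page_content, names)
    return count / (count + 1)
```

For example, a page mentioning the full name once and the abbreviation twice gives a count of 3 and a degree of 0.75 under this mapping.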
In addition, when the target entity is an economic entity, the electronic device may also set in advance a preset degree of association for each attribute of the economic entity. When the page content of a preset webpage includes content related to an attribute of the economic entity, the preset degree of association corresponding to that attribute may be taken as the degree of association between the page content and the economic entity. For example, one attribute of an economic entity is the type of economic activity it engages in. When the page content of a preset webpage includes information related to that type of economic activity, the degree of association between the page content and the economic entity may be determined according to the preset degree of association corresponding to the economic activity type. Take an enterprise engaged in clothing foreign trade, whose economic activity attribute is "clothing foreign trade". When the page content of a preset webpage on the Internet includes the information "clothing foreign trade enters a severe winter", the page content includes "clothing foreign trade", so the preset degree of association corresponding to the economic activity type attribute may be determined as the degree of association between the page content and the target entity.
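The attribute-based variant can be sketched as a lookup of preset values. The attribute key `economic_activity`, the value 0.6, and the rule of taking the maximum when several attributes match are all assumptions for illustration; the application does not give concrete values or a combination rule.

```python
from typing import Dict

# Hypothetical preset degrees of association, one per entity attribute.
PRESET_ASSOCIATION: Dict[str, float] = {"economic_activity": 0.6}


def attribute_association(
    page_content: str,
    entity_attributes: Dict[str, str],
    preset: Dict[str, float] = PRESET_ASSOCIATION,
) -> float:
    """Return the preset degree for attributes whose value appears in the page.

    If several attributes match, the largest preset degree is used (assumed);
    if none match, the degree is 0.0.
    """
    degrees = [
        preset[attr]
        for attr, value in entity_attributes.items()
        if attr in preset and value and value in page_content
    ]
    return max(degrees, default=0.0)
```

With the entity attribute `{"economic_activity": "clothing foreign trade"}`, a page containing that phrase receives the preset degree for the economic activity attribute.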
Step 2013: take the page content of the preset webpages whose degree of association with the target entity exceeds a predetermined threshold as the page content of the webpages corresponding to the target entity.
In these optional implementations, the electronic device may take the page content of the preset webpages whose degree of association with the target entity exceeds the predetermined threshold as the page content of the webpages corresponding to the target entity. The predetermined threshold may be set according to the actual application and is not limited here.
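The selection in step 2013 amounts to a filter over scored pages. In the sketch below, `degree_of` is a hypothetical callable supplied by the caller (for instance a count-based or attribute-based score), and the threshold value is left to the application, as the text states.

```python
from typing import Callable, List


def select_entity_pages(
    pages: List[str],
    degree_of: Callable[[str], float],
    threshold: float,
) -> List[str]:
    """Keep pages whose degree of association strictly exceeds the threshold."""
    return [page for page in pages if degree_of(page) > threshold]
```

The comparison is strict, matching the wording "exceeds a predetermined threshold".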
In this way, the electronic device can use incremental crawling to select, from the page content of the multiple preset webpages, the page content with a higher degree of association with the target entity, and analyze it for risk information corresponding to the target entity. While ensuring the timeliness of the determined risk information, this on the one hand simplifies the process of determining the risk information corresponding to the target entity, and on the other hand improves the accuracy of the risk information determined from the page content of the preset webpages.
In some optional implementations of this embodiment, obtaining the page content, within a predetermined time period, of webpages corresponding to the target entity in step 201 may further include crawling, from a search engine, the page content of webpages associated with the target entity. The electronic device may crawl this page content from the search engine incrementally, that is, obtain the page content once every predetermined time period. The page contents obtained in successive crawls do not overlap in publication time, which ensures that the crawled page content of webpages associated with the target entity is updated in a timely manner.
In some application scenarios, incremental crawling may be used to crawl simultaneously, within the predetermined time period, both the page content of webpages associated with the target entity from a search engine and the page content of the multiple preset webpages. A clustering operation is then performed on the page content crawled from the search engine and the page content of the preset webpages. For the page content of webpages in the same cluster, risk information analysis may be performed on only one or a few of them. In this way, on the one hand the crawled page content of webpages corresponding to the target entity is more comprehensive, and on the other hand the amount of computation needed to analyze the risk information corresponding to the target entity is reduced.
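The application does not name a clustering algorithm. As one possible illustration, the sketch below groups pages greedily by Jaccard similarity of their word sets and keeps the first page of each group as the representative to analyze. The similarity measure, the threshold of 0.5, and whitespace tokenization are all assumptions made for this example.

```python
from typing import List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two token sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_pages(pages: List[str], threshold: float = 0.5) -> List[List[str]]:
    """Greedy clustering: a page joins the first cluster whose seed it resembles."""
    clusters: List[Tuple[Set[str], List[str]]] = []
    for page in pages:
        tokens = set(page.lower().split())
        for seed_tokens, members in clusters:
            if jaccard(tokens, seed_tokens) >= threshold:
                members.append(page)
                break
        else:
            # No similar cluster found: this page seeds a new one.
            clusters.append((tokens, [page]))
    return [members for _, members in clusters]


def representatives(pages: List[str], threshold: float = 0.5) -> List[str]:
    """One page per cluster; only these need risk information analysis."""
    return [members[0] for members in cluster_pages(pages, threshold)]
```

Near-duplicate reports of the same event then cost one analysis instead of several, which is the reduction in computation the text describes.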
In some optional implementations of this embodiment, the page content may include a title. Determining in step 202 whether the page content includes a risk word in the preset risk vocabulary may further include determining whether the title of the page content includes a risk word in the preset risk vocabulary. If the title includes a risk word in the preset risk vocabulary, the electronic device may determine the risk type corresponding to that risk word as the risk type corresponding to the target entity. This reduces the complexity of determining the risk information corresponding to the target entity from the page content of a web page, and helps increase the speed of that determination.
In some optional implementations of this embodiment, the risk information corresponding to the target entity that is determined in step 203, in response to determining that the page content includes a risk word in the preset risk vocabulary and based on that risk word, may include a risk factor in addition to the risk type. Here, the risk factor characterizes the probability that the risk occurs.
In these optional implementations, the electronic device may further determine the risk factor of the target entity based on the number of times the name of the target entity appears in the page content and the position at which it appears in the page content. The risk factor is positively correlated with the number of times the name of the target entity appears in the page content, and negatively correlated with the ordinal position at which the name appears in the page content. That is, the more often the name of the target entity appears in the page content, the greater the probability that the risk indicated by the risk type corresponding to the target entity occurs; and the earlier the name appears in the page content, the greater that probability.
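The text states only the direction of the two correlations, not a formula. The sketch below is one hypothetical scoring function that satisfies them: the saturating count term, the relative-position term, and the equal weights are assumptions chosen for illustration, not values taken from the application.

```python
def risk_factor(entity_name: str, page_content: str,
                count_weight: float = 0.5, position_weight: float = 0.5) -> float:
    """Hypothetical risk factor in [0, 1].

    Grows with the number of mentions of the entity name and shrinks as
    the first mention moves later in the page content.
    """
    if not entity_name or not page_content:
        return 0.0
    count = page_content.count(entity_name)
    if count == 0:
        return 0.0
    # Saturating count term: 1 mention -> 0.5, 2 -> 0.67, 3 -> 0.75, ...
    count_term = count / (count + 1)
    # Position term: 1.0 when the name opens the page, approaching 0 near the end.
    first_index = page_content.find(entity_name)
    position_term = 1.0 - first_index / len(page_content)
    return count_weight * count_term + position_weight * position_term
```

With this sketch, a page that names the entity twice scores higher than one that names it once at the same position, and a page that names it in the opening words scores higher than one that names it only near the end.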
In some optional implementations of this embodiment, the risk information corresponding to the target entity that is determined in step 203, in response to determining that the page content includes a risk word in the preset risk vocabulary and based on that risk word, further includes a risk level in addition to the risk type. The preset risk vocabulary stored in the electronic device stores risk words in association with the risk level corresponding to each risk word. In response to determining that the page content includes a risk word in the preset risk vocabulary, the electronic device may determine the risk type and risk level corresponding to the target entity based on that risk word. Specifically, the risk type and risk level corresponding to the risk word included in the page content may be determined as the risk type and risk level corresponding to the target entity. The risk level may include, for example, high risk, medium risk, and low risk.
In some application scenarios, the page content may be, for example, the page content of a web page that publishes administrative penalty information. The electronic device may determine whether the title of the page content includes a risk word in the preset risk vocabulary. If it does, the risk type corresponding to the target entity, and the risk level corresponding to that risk type, may be determined from the risk word included in the title.
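The title lookup in this scenario can be sketched as a dictionary that stores each risk word together with its risk type and risk level. The vocabulary entries, the `RiskEntry` type, and the substring matching used below are hypothetical and serve only to illustrate the associated storage; the application does not list concrete risk words.

```python
from typing import Dict, NamedTuple, Optional


class RiskEntry(NamedTuple):
    risk_type: str
    risk_level: str  # e.g. "high", "medium", "low"


# Hypothetical preset risk vocabulary: risk word -> (risk type, risk level).
RISK_VOCABULARY: Dict[str, RiskEntry] = {
    "penalty": RiskEntry("administrative penalty", "high"),
    "lawsuit": RiskEntry("litigation", "medium"),
    "overdue": RiskEntry("credit", "low"),
}


def risk_from_title(
    title: str, vocabulary: Dict[str, RiskEntry] = RISK_VOCABULARY
) -> Optional[RiskEntry]:
    """Return the entry of the first risk word found in the title, or None."""
    lowered = title.lower()
    for word, entry in vocabulary.items():
        if word in lowered:
            return entry
    return None
```

Checking only the title keeps the lookup cheap, which matches the stated aim of speeding up the determination.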
With further reference to Fig. 3, a flow 300 of another embodiment of the information acquisition method is shown. The flow 300 of the information acquisition method includes the following steps:
Step 301, obtain the page content of web pages corresponding to the target entity within a predetermined time period.
Step 301 is the same as step 201 in the embodiment shown in Fig. 2 and is not described again here.
Step 302, determine whether the page content includes a risk word in the preset risk vocabulary.
Step 302 is the same as step 202 in the embodiment shown in Fig. 2 and is not described again here.
Step 303, in response to determining that the page content includes a risk word in the preset risk vocabulary, determine risk information corresponding to the target entity based on the risk word in the preset risk vocabulary included in the page content.
Step 303 is the same as step 203 in the embodiment shown in Fig. 2 and is not described again here.
Step 304, in response to determining that the page content does not include a risk word in the preset risk vocabulary, perform sentiment analysis on the title.
In this embodiment, the page content of the web page may include a title. When it is determined in step 302 that the page content of the web page does not include a risk word in the preset risk vocabulary, the electronic device on which the information acquisition method runs (for example, the first server 104 shown in Fig. 1) may, in response, perform sentiment analysis on the title of the page content using any of various sentiment analysis methods. These methods may include, for example: dictionary-based sentiment analysis, machine-learning-based sentiment analysis, hybrid dictionary and machine-learning sentiment analysis, sentiment analysis based on weakly labeled information, and deep-learning-based sentiment analysis. It should be noted that each of these sentiment analysis methods is a well-known technique that is widely studied and applied, and is not described further here.
Step 305, if the sentiment orientation indicated by the sentiment analysis result is negative, perform semantic analysis on the page content.
When the sentiment orientation indicated by the sentiment analysis result obtained in step 304 for the title of the page content is negative, the electronic device may further perform semantic analysis on the page content of the web page using a semantic analysis method. The semantic analysis method may be, for example, a topic-model semantic analysis method.
It should be noted that the above semantic analysis method is a well-known technique that is widely studied and applied, and is not described further here.
Step 306, determine the risk type corresponding to the page content according to the semantic analysis result.
The electronic device may determine the risk type corresponding to the page content of the web page according to the semantic analysis result obtained by performing semantic analysis on the page content in step 305.
That is, the electronic device may extract the semantics of the page content of the web page and determine the risk type corresponding to the page content according to those semantics.
In some application scenarios, the page content may be, for example, the page content of a web page that publishes administrative penalty information. After the electronic device determines that the page content of the web page does not include any risk word in the preset risk vocabulary, it may perform semantic analysis on the page content and determine the risk type corresponding to the page content according to the result of the semantic analysis.
As can be seen from Fig. 3, compared with the embodiment corresponding to Fig. 2, the flow 300 of the information acquisition method in this embodiment highlights first determining the orientation of the title through sentiment analysis, then performing semantic analysis on the page content when the title has a negative sentiment orientation, and determining the risk information corresponding to the page content according to the semantic analysis result. This embodiment can thereby make the risk information corresponding to the target entity more comprehensive and further improve the targeting of information push.
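The control flow of steps 302 to 306 can be sketched as below. The analysis components are deliberately simple stand-ins and are assumptions of this sketch: `title_sentiment` is a tiny dictionary-based scorer, and `semantic_risk_type` uses overlap with hand-written topic term sets in place of a real topic model. The word lists, topic names, and function names are hypothetical.

```python
import re
from typing import Dict, List, Optional

# Hypothetical lexicons, for illustration only.
NEGATIVE_WORDS = {"scandal", "fraud", "collapse", "violation"}
POSITIVE_WORDS = {"award", "growth", "record"}
TOPIC_TERMS = {
    "financial": {"debt", "loan", "bankruptcy"},
    "compliance": {"regulator", "license", "inspection"},
}


def _words(text: str) -> List[str]:
    return re.findall(r"[a-z]+", text.lower())


def title_sentiment(title: str) -> str:
    """Dictionary-based stand-in for step 304."""
    words = _words(title)
    score = (sum(w in POSITIVE_WORDS for w in words)
             - sum(w in NEGATIVE_WORDS for w in words))
    if score < 0:
        return "negative"
    return "positive" if score > 0 else "neutral"


def semantic_risk_type(body: str) -> Optional[str]:
    """Stand-in for steps 305 and 306: pick the topic with the most matching terms."""
    words = set(_words(body))
    best = max(TOPIC_TERMS, key=lambda topic: len(TOPIC_TERMS[topic] & words))
    return best if TOPIC_TERMS[best] & words else None


def determine_risk_type(title: str, body: str,
                        vocabulary: Dict[str, str]) -> Optional[str]:
    """Steps 302 to 306: vocabulary match first, sentiment and semantics as fallback."""
    text = f"{title} {body}".lower()
    for word, risk_type in vocabulary.items():
        if word in text:                       # steps 302 and 303
            return risk_type
    if title_sentiment(title) != "negative":   # step 304
        return None
    return semantic_risk_type(body)            # steps 305 and 306
```

The fallback runs only when no risk word matches, so the cheaper vocabulary lookup handles the common case and the heavier analysis is reserved for pages it misses.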
With further reference to Fig. 4, a flow 400 of another embodiment of the information acquisition method is shown. The flow 400 of the information acquisition method includes the following steps:
Step 401, obtain the page content of web pages corresponding to the target entity within a predetermined time period.
Step 401 is the same as step 201 in the embodiment shown in Fig. 2 and is not described again here.
Step 402, extract keywords from the page content.
In this embodiment, after obtaining in step 401 the page content of web pages corresponding to the target entity within a predetermined time period, the electronic device on which the information acquisition method runs may extract keywords from the page content of the web page using any of various keyword extraction methods.
Step 403, determine whether each risk word in the preset risk vocabulary matches a keyword.
In this embodiment, after the keywords of the page content of the web page are extracted in step 402, the electronic device may determine whether each risk word in the preset risk vocabulary matches a keyword. Specifically, each risk word in the preset risk vocabulary may be matched one by one against the page content of the web page. In this way it can be determined whether each risk word in the preset risk vocabulary matches a keyword of the page content of the web page.
Step 404, in response to determining that at least one risk word in the preset risk vocabulary matches a keyword, determine the risk type and risk factor corresponding to the target entity based on the risk word that matches the keyword.
In this embodiment, in response to determining that at least one risk word in the preset risk vocabulary matches a keyword of the page content of the web page, the electronic device may determine the risk type corresponding to the matched risk word as the risk type corresponding to the target entity. Further, the electronic device may determine the risk factor of the risk type corresponding to the target entity. Here, the risk factor may be determined according to preset rules. For example, the preset rules may specify the risk factor of the risk type corresponding to a risk word for the case in which that risk word appears among the keywords.
In some application scenarios, the page content may be, for example, the page content of a web page that publishes administrative penalty information. After determining that the page content of the web page publishing administrative penalty information is related to the target entity (for example, an economic entity), the electronic device may extract keywords from the page content. When a keyword of the page content matches a risk word in the risk vocabulary, the risk type corresponding to the target entity may be determined to be the risk type corresponding to the matched risk word. In addition, the risk factor corresponding to that risk type may be determined. The risk factor of a risk type matched from the keywords of page content is usually relatively large, that is, the probability that the risk indicated by the risk type occurs is relatively high.
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the information acquisition method in this embodiment highlights the steps of extracting keywords from the page content and determining the risk information corresponding to the target entity according to those keywords. In this way, on the one hand the step of determining the risk information corresponding to the target entity is further simplified, and on the other hand the accuracy of the determined risk information is further improved.
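Steps 402 to 404 can be sketched as follows. The application leaves the keyword extraction method open, so the frequency-based extractor, the stop-word list, the vocabulary entries, and the preset factor table below are all assumptions introduced for illustration.

```python
import re
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical stop words, vocabulary, and preset rules.
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in"}
RISK_VOCABULARY = {"penalty": "administrative penalty", "lawsuit": "litigation"}
PRESET_FACTORS = {"administrative penalty": 0.8, "litigation": 0.6}


def extract_keywords(text: str, top_k: int = 3) -> List[str]:
    """Step 402 (stand-in): the top_k most frequent non-stop-word tokens."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(top_k)]


def match_risk(keywords: List[str],
               vocabulary: Dict[str, str] = RISK_VOCABULARY,
               factors: Dict[str, float] = PRESET_FACTORS) -> List[Tuple[str, float]]:
    """Steps 403 and 404: map each keyword that is a risk word to
    (risk type, risk factor taken from the preset rules)."""
    return [(vocabulary[k], factors[vocabulary[k]]) for k in keywords if k in vocabulary]
```

Matching against a short keyword list rather than the full page text is what makes this variant cheaper, at the cost of depending on the quality of the keyword extractor.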
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an information acquisition device. This device embodiment corresponds to the method embodiment shown in Fig. 2, and the device may be applied in various electronic devices.
As shown in Fig. 5, the information acquisition device 500 of this embodiment includes an acquiring unit 501, a first determination unit 502, and a second determination unit 503. The acquiring unit 501 is configured to obtain the page content of web pages corresponding to a target entity within a predetermined time period. The first determination unit 502 is configured to determine whether the page content includes a risk word in a preset risk vocabulary. The second determination unit 503 is configured to, in response to determining that the page content includes a risk word in the preset risk vocabulary, determine risk information corresponding to the target entity based on the risk word in the preset risk vocabulary included in the page content, wherein the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word.
In this embodiment, for the specific processing of the acquiring unit 501, the first determination unit 502, and the second determination unit 503 of the information acquisition device 500, and the technical effects thereof, reference may be made to the descriptions of step 201, step 202, and step 203 in the embodiment corresponding to Fig. 2, respectively, which are not repeated here.
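The division into three units can be pictured with the following sketch, in which each unit is a callable or method on a single class. The class name, method names, and the injected `acquire` callable are hypothetical; the application does not prescribe how the units are realized in software.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class InformationAcquisitionDevice:
    # Acquiring unit 501: returns page contents for an entity (injected, hypothetical).
    acquire: Callable[[str], List[str]]
    # Preset risk vocabulary: risk word -> risk type.
    vocabulary: Dict[str, str]

    def find_risk_words(self, page_content: str) -> List[str]:
        """First determination unit 502: risk words present in the page content."""
        return [word for word in self.vocabulary if word in page_content]

    def risk_types(self, risk_words: List[str]) -> List[str]:
        """Second determination unit 503: risk types of the matched risk words."""
        return [self.vocabulary[word] for word in risk_words]

    def run(self, entity: str) -> List[str]:
        """Chain the three units and return the distinct risk types found."""
        found = set()
        for page_content in self.acquire(entity):
            found.update(self.risk_types(self.find_risk_words(page_content)))
        return sorted(found)
```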
In some optional implementations of this embodiment, the acquiring unit 501 is further configured to: crawl the page content of multiple preset web pages within the predetermined time period; determine the degree of association between the page content of each acquired preset web page and the target entity; and take the page content of the preset web pages whose degree of association with the target entity is greater than a predetermined threshold as the page content of the web pages corresponding to the target entity.
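The threshold filter performed by the acquiring unit can be sketched as below. This passage does not define how the degree of association is computed, so the mention-count measure, the `saturate_at` parameter, and the default threshold are assumptions made for illustration only.

```python
from typing import List


def association_degree(entity_name: str, page_content: str, saturate_at: int = 3) -> float:
    """Hypothetical association degree in [0, 1]: the number of mentions of
    the entity name, capped at saturate_at mentions."""
    if not entity_name:
        return 0.0
    mentions = page_content.count(entity_name)
    return min(1.0, mentions / saturate_at)


def select_related_pages(entity_name: str, pages: List[str],
                         threshold: float = 0.3) -> List[str]:
    """Keep only pages whose association degree exceeds the threshold."""
    return [page for page in pages
            if association_degree(entity_name, page) > threshold]
```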
In some optional implementations of this embodiment, the acquiring unit 501 is further configured to: crawl, from a search engine, the page content of web pages associated with the target entity.
In some optional implementations of this embodiment, the page content includes a title, and the first determination unit 502 is further configured to: determine whether the title includes a risk word in the preset risk vocabulary.
In some optional implementations of this embodiment, the information acquisition device further includes a third determination unit 504. The third determination unit 504 is configured to: if it is determined that the page content does not include a risk word in the preset risk vocabulary, perform sentiment analysis on the title; if the sentiment orientation indicated by the sentiment analysis result is negative, perform semantic analysis on the page content; and determine the risk type corresponding to the page content according to the semantic analysis result.
In some optional implementations of this embodiment, the risk information further includes a risk factor, and the information acquisition device further includes a fourth determination unit 505. The fourth determination unit 505 is configured to: determine the risk factor of the target entity based on the number of times the name of the target entity appears in the page content and the position at which it appears in the page content, wherein the risk factor is positively correlated with the number of times the name of the target entity appears in the page content, and negatively correlated with the ordinal position at which the name of the target entity appears in the page content.
In some optional implementations of this embodiment, the first determination unit 502 is further configured to: extract keywords from the page content; and determine whether each risk word in the preset risk vocabulary matches a keyword.
In some optional implementations of this embodiment, the risk information further includes a risk factor, and the second determination unit 503 is further configured to: in response to at least one risk word in the preset risk vocabulary matching a keyword, determine the risk type and risk factor corresponding to the target entity based on the risk word that matches the keyword.
In some optional implementations of this embodiment, the risk information further includes a risk level, and the preset risk vocabulary stores risk words in association with the risk level corresponding to each risk word; and the second determination unit 503 is further configured to: determine the risk type and risk level corresponding to the target entity based on the risk word in the preset risk vocabulary included in the page content.
Referring now to Fig. 6, a schematic structural diagram of a computer system 600 suitable for implementing a terminal device/server of the embodiments of the present application is shown. The terminal device/server shown in Fig. 6 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including, for example, a cathode ray tube (CRT) or liquid crystal display (LCD) and a loudspeaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a local area network (LAN) card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network through the communication section 609 and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the methods of the present application are performed. It should be noted that the computer-readable medium described herein may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example (but not limited to), an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.
More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in connection with an instruction execution system, apparatus, or device. Also in the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, it may be described as: a processor including an acquiring unit, a first determination unit, and a second determination unit. The names of these units do not, under certain circumstances, constitute a limitation on the units themselves; for example, the acquiring unit may also be described as "a unit that obtains the page content of web pages corresponding to a target entity within a predetermined time period".
In another aspect, the present application also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: obtain the page content of web pages corresponding to a target entity within a predetermined time period; determine whether the page content includes a risk word in a preset risk vocabulary; and, in response to determining that the page content includes a risk word in the preset risk vocabulary, determine risk information corresponding to the target entity based on the risk word in the preset risk vocabulary included in the page content, wherein the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word.
The above description is only of the preferred embodiments of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present application.

Claims (20)

1. An information acquisition method, comprising:
obtaining the page content of web pages corresponding to a target entity within a predetermined time period;
determining whether the page content includes a risk word in a preset risk vocabulary;
in response to determining that the page content includes a risk word in the preset risk vocabulary, determining risk information corresponding to the target entity based on the risk word in the preset risk vocabulary included in the page content, wherein the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word.
2. The method according to claim 1, wherein the obtaining the page content of web pages corresponding to a target entity within a predetermined time period comprises:
crawling the page content of multiple preset web pages within the predetermined time period;
determining the degree of association between the page content of each acquired preset web page and the target entity;
taking the page content of the preset web pages whose degree of association with the target entity is greater than a preset association-degree threshold as the page content of the web pages corresponding to the target entity.
3. The method according to claim 1, wherein the obtaining the page content of web pages corresponding to a target entity within a predetermined time period comprises:
crawling, from a search engine, the page content of web pages associated with the target entity.
4. The method according to claim 2, wherein the page content includes a title; and
the determining whether the page content includes a risk word in a preset risk vocabulary comprises:
determining whether the title includes a risk word in the preset risk vocabulary.
5. The method according to claim 4, wherein the method further comprises:
if it is determined that the page content does not include a risk word in the preset risk vocabulary, performing sentiment analysis on the title;
if the sentiment orientation indicated by the sentiment analysis result is negative, performing semantic analysis on the page content;
determining the risk type corresponding to the page content according to the semantic analysis result.
6. The method according to claim 1, wherein the risk information further includes a risk factor, and the method further comprises:
determining the risk factor of the target entity based on the number of times the name of the target entity appears in the page content and the position at which it appears in the page content, wherein the risk factor is positively correlated with the number of times the name of the target entity appears in the page content, and the risk factor of the target entity is negatively correlated with the ordinal position at which the name of the target entity appears in the page content.
7. The method according to claim 1, wherein the determining whether the page content includes a risk word in a preset risk vocabulary comprises:
extracting keywords from the page content;
determining whether each risk word in the preset risk vocabulary matches a keyword.
8. The method according to claim 7, wherein the risk information further includes a risk factor; and
the determining, in response to determining that the page content includes a risk word in the preset risk vocabulary, risk information corresponding to the target entity based on the risk word in the preset risk vocabulary included in the page content comprises:
in response to at least one risk word in the preset risk vocabulary matching a keyword, determining the risk type and risk factor corresponding to the target entity based on the risk word that matches the keyword.
9. The method according to claim 1, wherein the risk information further includes a risk level, and the preset risk vocabulary stores risk words in association with the risk level corresponding to each risk word; and
the determining, in response to determining that the page content includes a risk word in the preset risk vocabulary, risk information corresponding to the target entity based on the risk word in the preset risk vocabulary included in the page content comprises:
determining the risk type and risk level corresponding to the target entity based on the risk word in the preset risk vocabulary included in the page content.
10. a kind of information acquisition device, including:
Acquiring unit is configured to obtain the content of pages of webpage corresponding with target entity within a predetermined period of time;
First determination unit is configured to determine the risk the word whether content of pages is included in default risk vocabulary;
Second determination unit is configured in response to determining that the content of pages includes the risk word in default risk vocabulary, base Risk word in the default risk vocabulary included by the content of pages determines risk information corresponding with the target entity, Wherein, the risk information includes risk classifications, the risky word of associated storage and each risk word in the default risk vocabulary Risk classifications.
11. The device according to claim 10, wherein the acquiring unit is further configured to:
crawl page content of a plurality of preset web pages within the predetermined time period;
determine a degree of association between the page content of each crawled preset web page and the target entity; and
use the page content of the preset web pages whose degree of association with the target entity exceeds a predetermined threshold as the page content of the web pages corresponding to the target entity.
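The threshold filter of claim 11 can be sketched as follows. The claim does not say here how the degree of association is computed, so the measure below (entity mentions divided by word count) is purely an assumed stand-in, and the names `relevance` and `select_pages` are hypothetical.

```python
def relevance(content: str, entity: str) -> float:
    """Assumed measure: mentions of the entity name per word of page content."""
    words = content.split()
    if not words:
        return 0.0
    return content.lower().count(entity.lower()) / len(words)


def select_pages(pages: dict[str, str], entity: str, threshold: float) -> dict[str, str]:
    """Keep only the pages whose degree of association exceeds the threshold."""
    return {
        url: content
        for url, content in pages.items()
        if relevance(content, entity) > threshold
    }
```

For example, with a threshold of 0.2, a five-word page that mentions the entity twice (0.4) is kept, while a nine-word page that mentions it once (about 0.11) is dropped.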
12. The device according to claim 10, wherein the acquiring unit is further configured to:
crawl, from a search engine, page content of web pages associated with the target entity.
13. The device according to claim 11, wherein the page content includes a title; and
the first determination unit is further configured to:
determine whether the title includes a risk word in the preset risk vocabulary.
14. The device according to claim 12, wherein the device further comprises a third determination unit, the third determination unit being configured to:
if it is determined that the page content does not include a risk word in the preset risk vocabulary, perform sentiment analysis on the title;
if the sentiment orientation indicated by the sentiment analysis result is negative, perform semantic analysis on the page content; and
determine the risk type corresponding to the page content according to the semantic analysis result.
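The fallback path of claim 14 is a two-stage gate: semantic analysis runs only when the title's sentiment is negative. In the sketch below the sentiment and semantic analyzers are passed in as callables; the toy implementations (`toy_sentiment`, `toy_semantic_type`) are crude placeholders invented for illustration and stand in for whatever models a real system would use.

```python
from typing import Callable, Optional


def classify_without_risk_word(
    title: str,
    content: str,
    sentiment: Callable[[str], float],
    semantic_type: Callable[[str], str],
) -> Optional[str]:
    """Return a risk type only when the title sentiment is negative, else None."""
    if sentiment(title) >= 0:
        return None  # neutral or positive title: no further analysis
    return semantic_type(content)


# Toy stand-ins (assumptions, not real analyzers).
NEGATIVE_CUES = {"plunge", "scandal"}


def toy_sentiment(title: str) -> float:
    return -1.0 if any(cue in title.lower() for cue in NEGATIVE_CUES) else 0.0


def toy_semantic_type(content: str) -> str:
    return "financial" if "shares" in content.lower() else "other"
```

Running the cheaper title-level check first means the more expensive full-page semantic analysis is skipped for most pages.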
15. The device according to claim 10, wherein the risk information further includes a risk factor, and the device further comprises a fourth determination unit;
the fourth determination unit is configured to:
determine the risk factor of the target entity based on the number of times the name of the target entity appears in the page content and the positions at which it appears in the page content, wherein the risk factor is positively correlated with the number of times the name of the target entity appears in the page content, and negatively correlated with the sequential order of the positions at which the name of the target entity appears in the page content.
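Claim 15 fixes only the direction of the two correlations, not a formula. One scoring function that satisfies both is sketched below: each occurrence of the entity name contributes a weight that shrinks the later it appears. The function name `risk_factor` and the specific weighting are assumptions chosen for illustration.

```python
def risk_factor(content: str, entity_name: str) -> float:
    """Sum a position-dependent weight over every occurrence of the entity name.

    Each occurrence adds 1 - (offset / length), so more occurrences raise the
    score and later positions lower it. This formula is an assumed example.
    """
    text, name = content.lower(), entity_name.lower()
    if not text or not name:
        return 0.0
    score = 0.0
    start = text.find(name)
    while start != -1:
        score += 1.0 - start / len(text)
        start = text.find(name, start + len(name))
    return score
```

Under this weighting, a page that names the entity in its first sentence scores higher than one that names it only at the end, and a second mention always increases the score.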
16. The device according to claim 10, wherein the first determination unit is further configured to:
extract keywords of the page content; and
determine whether each risk word in the preset risk vocabulary matches the keywords.
17. The device according to claim 16, wherein the risk information further includes a risk factor; and
the second determination unit is further configured to:
in response to at least one risk word in the preset risk vocabulary matching the keywords, determine a risk type and a risk factor corresponding to the target entity based on the risk word matched by the keywords.
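Claims 16 and 17 match risk words against extracted keywords rather than against the raw text. The sketch below uses frequency-based keyword extraction as a simple stand-in, since the claims do not name an extraction method; storing a risk factor alongside each risk word is likewise an assumption, as are the sample vocabulary and the names `extract_keywords` and `match_risk`.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "of", "and", "in", "to", "is", "for"}

# Hypothetical vocabulary: risk word -> (risk type, risk factor).
RISK_VOCABULARY = {
    "fraud": ("compliance", 0.9),
    "layoffs": ("operational", 0.5),
}


def extract_keywords(content: str, top_k: int = 5) -> list[str]:
    """Stand-in extractor: the top_k most frequent non-stopword tokens."""
    tokens = [t for t in re.findall(r"[a-z]+", content.lower()) if t not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(top_k)]


def match_risk(content: str) -> dict[str, tuple[str, float]]:
    """Return the vocabulary entries whose risk word is among the page keywords."""
    keywords = set(extract_keywords(content))
    return {word: info for word, info in RISK_VOCABULARY.items() if word in keywords}
```

Matching on keywords instead of all tokens filters out risk words that are mentioned only in passing.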
18. The device according to claim 10, wherein the risk information further includes a risk level, and the preset risk vocabulary stores, in association, risk words and the risk level corresponding to each risk word; and
the second determination unit is further configured to:
determine a risk type and a risk level corresponding to the target entity based on the risk word in the preset risk vocabulary included in the page content.
19. A server, comprising:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-9.
20. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1-9.
CN201810049805.3A 2018-01-18 2018-01-18 Information acquisition method and device Active CN108256078B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810049805.3A CN108256078B (en) 2018-01-18 2018-01-18 Information acquisition method and device

Publications (2)

Publication Number Publication Date
CN108256078A (en) 2018-07-06
CN108256078B (en) 2019-07-12

Family

ID=62741320

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810049805.3A Active CN108256078B (en) 2018-01-18 2018-01-18 Information acquisition method and device

Country Status (1)

Country Link
CN (1) CN108256078B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5972425B1 (en) * 2015-05-08 2016-08-17 株式会社エルプランニング Reputation damage risk report creation system, program and method
CN104951548A (en) * 2015-06-24 2015-09-30 烟台中科网络技术研究所 Method and system for calculating negative public opinion index
US20170004128A1 (en) * 2015-07-01 2017-01-05 Institute for Sustainable Development Device and method for analyzing reputation for objects by data mining
CN105956740A (en) * 2016-04-19 2016-09-21 北京深度时代科技有限公司 Semantic risk calculating method based on text logical characteristic
CN105844424A (en) * 2016-05-30 2016-08-10 中国计量学院 Product quality problem discovery and risk assessment method based on network comments

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111914064A (en) * 2020-07-29 2020-11-10 王嘉兴 Text mining method, device, equipment and medium
CN112862305A (en) * 2021-02-03 2021-05-28 北京百度网讯科技有限公司 Method, device, equipment and storage medium for determining risk state of object
CN116910231A (en) * 2023-09-11 2023-10-20 社治无忧(成都)智慧科技有限公司 WeChat public opinion early warning method and system based on natural language processing
CN116910231B (en) * 2023-09-11 2023-11-17 社治无忧(成都)智慧科技有限公司 WeChat public opinion early warning method and system based on natural language processing

Also Published As

Publication number Publication date
CN108256078B (en) 2019-07-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant