CN108256078A - Information acquisition method and device - Google Patents
Information acquisition method and device
- Publication number
- CN108256078A CN201810049805.3A
- Authority
- CN
- China
- Prior art keywords
- risk
- content
- pages
- word
- target entity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Abstract
Embodiments of the present application disclose an information acquisition method and device. One specific embodiment of the method includes: obtaining page content, within a predetermined time period, of web pages corresponding to a target entity; determining whether the page content includes a risk word in a preset risk vocabulary; and, in response to determining that the page content includes a risk word in the preset risk vocabulary, determining risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that the page content includes, where the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word. This embodiment can improve the accuracy of the determined risk information corresponding to the target entity.
Description
Technical field
The present application relates to the field of computer technology, specifically to the field of Internet technology, and more particularly to an information acquisition method and device.
Background
With the development of the Internet, it has gradually become one of the main media. In the Internet era, information about enterprises and information about individuals are both becoming increasingly transparent. Information related to an enterprise (for example, its staff composition, rewards and penalties, and financial situation) can be obtained not only from content published on the enterprise's official website, but also from comments about the enterprise that users publish on Internet portals, forums, news sites, blogs, microblogs, and the like.
Although information about an enterprise or an individual can be obtained through the Internet, the information obtained about any one enterprise or individual is usually one-sided.
Summary of the invention
Embodiments of the present application propose an information acquisition method and device.
In a first aspect, an embodiment of the present application provides an information acquisition method, including: obtaining page content, within a predetermined time period, of web pages corresponding to a target entity; determining whether the page content includes a risk word in a preset risk vocabulary; and, in response to determining that the page content includes a risk word in the preset risk vocabulary, determining risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that the page content includes, where the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word.
In some embodiments, obtaining the page content, within the predetermined time period, of web pages corresponding to the target entity includes: crawling the page content of multiple preset web pages within the predetermined time period; determining the degree of association between the page content of each preset web page obtained and the target entity; and taking the page content of the preset web pages whose degree of association with the target entity exceeds a predetermined threshold as the page content of the web pages corresponding to the target entity.
In some embodiments, obtaining the page content, within the predetermined time period, of web pages corresponding to the target entity includes: crawling, from a search engine, the page content of web pages associated with the target entity.
In some embodiments, the page content includes a title, and determining whether the page content includes a risk word in the preset risk vocabulary includes: determining whether the title includes a risk word in the preset risk vocabulary.
In some embodiments, the information acquisition method further includes: if it is determined that the page content does not include a risk word in the preset risk vocabulary, performing sentiment analysis on the title; if the sentiment orientation indicated by the sentiment analysis result is negative, performing semantic analysis on the page content; and determining the risk type corresponding to the page content according to the semantic analysis result.
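The fallback path described above can be sketched as follows. This is a minimal illustration, not the method of the application: the text does not name a sentiment or semantic analysis technique, so both analysers are passed in as callables, and the function names, the cue-word set, and the convention that a score below zero means negative sentiment are all assumptions made for the example.

```python
from typing import Callable, Optional


def classify_without_risk_words(
    title: str,
    content: str,
    title_sentiment: Callable[[str], float],
    semantic_risk_type: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Fallback used when no word from the risk vocabulary was found."""
    # First, run sentiment analysis on the title only.
    if title_sentiment(title) >= 0:
        return None  # sentiment is not negative, so no risk type is assigned
    # Only for negative titles, run semantic analysis on the full page content.
    return semantic_risk_type(content)


# Toy stand-ins for real analysers (assumptions for demonstration only).
NEGATIVE_CUES = {"plunge", "scandal", "collapse"}


def toy_sentiment(text: str) -> float:
    """Return -1.0 if the text contains a negative cue word, else 1.0."""
    return -1.0 if set(text.lower().split()) & NEGATIVE_CUES else 1.0


def toy_semantics(text: str) -> Optional[str]:
    """Return a risk type for one hard-coded phrase, else None."""
    return "market risk" if "share price" in text.lower() else None
```

In a real system the two toy functions would be replaced by an actual sentiment model and semantic classifier; the control flow (title first, full content only when the title is negative) is the part taken from the text.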
In some embodiments, the risk information further includes a risk factor, and the information acquisition method further includes: determining the risk factor of the target entity based on the number of times the name of the target entity appears in the page content and the position at which it appears in the page content, where the risk factor is positively correlated with the number of times the name of the target entity appears in the page content, and negatively correlated with the order of the position at which the name of the target entity appears in the page content.
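The text fixes only the direction of the two correlations, not a formula. The sketch below uses one assumed formula that satisfies both: a frequency term that grows with the number of mentions, multiplied by a position term that shrinks the later the first mention appears. The function name and the formula itself are illustrative assumptions.

```python
def risk_factor(content: str, entity_name: str) -> float:
    """Return a score in [0, 1) that rises with the number of mentions of the
    entity name and falls the later the first mention appears.

    The formula is an assumption chosen only to satisfy the two stated
    correlations; it is not taken from the application.
    """
    if not content or not entity_name:
        return 0.0
    count = content.count(entity_name)
    if count == 0:
        return 0.0
    first = content.find(entity_name)
    frequency_term = count / (count + 1)         # increases with the count
    position_term = 1.0 - first / len(content)   # 1.0 when the name opens the text
    return frequency_term * position_term
```

For two pages of equal length, the one that mentions the name more often, or mentions it earlier, receives the larger score.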
In some embodiments, determining whether the page content includes a risk word in the preset risk vocabulary includes: extracting keywords from the page content; and determining whether each risk word in the preset risk vocabulary matches the keywords.
In some embodiments, the risk information further includes a risk factor, and determining, in response to determining that the page content includes a risk word in the preset risk vocabulary, the risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that the page content includes, includes: in response to at least one risk word in the preset risk vocabulary matching the keywords, determining the risk type and the risk factor corresponding to the target entity based on the risk words that match the keywords.
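A minimal sketch of the keyword-based variant follows. The application does not say how keywords are extracted, so plain term frequency over English tokens is assumed here; the stop-word list, function names, and sample vocabulary are hypothetical, and Chinese text would additionally need word segmentation. Only the risk types are derived; the risk factor is left out.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not from the source).
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for"}


def extract_keywords(content: str, top_n: int = 5) -> list:
    """Return the top_n most frequent non-stop-word tokens of the content."""
    tokens = [t for t in re.findall(r"[a-z]+", content.lower()) if t not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(top_n)]


def matched_risk_words(keywords: list, vocabulary: dict) -> dict:
    """Check every risk word in the vocabulary against the extracted keywords
    and return {risk word: risk type} for those that match."""
    keyword_set = set(keywords)
    return {word: rtype for word, rtype in vocabulary.items() if word in keyword_set}
```

Matching against a handful of keywords rather than the full text keeps the comparison small, at the cost of missing risk words that are not prominent enough to be extracted.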
In some embodiments, the risk information further includes a risk level, and the preset risk vocabulary stores risk words in association with the risk level corresponding to each risk word; and determining, in response to determining that the page content includes a risk word in the preset risk vocabulary, the risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that the page content includes, includes: determining the risk type and the risk level corresponding to the target entity based on the risk words of the preset risk vocabulary that the page content includes.
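One way to store a level alongside each risk word, and to derive a type and level for the entity, is sketched below. The integer scale, the sample entries, and the choice to report the highest level per risk type are assumptions; the text only states that each risk word has an associated risk level.

```python
from typing import NamedTuple


class RiskEntry(NamedTuple):
    risk_type: str
    risk_level: int  # assumed scale: a larger number means a more severe risk


# Hypothetical vocabulary storing a type and a level for every risk word.
VOCABULARY = {
    "bankruptcy": RiskEntry("operating risk", 3),
    "production halt": RiskEntry("operating risk", 2),
    "mass resignation": RiskEntry("personnel risk", 2),
}


def types_and_levels(matched_words, vocabulary=VOCABULARY) -> dict:
    """Return {risk type: highest level among the matched words of that type}."""
    result = {}
    for word in matched_words:
        entry = vocabulary[word]
        current = result.get(entry.risk_type, 0)
        result[entry.risk_type] = max(current, entry.risk_level)
    return result
```

Taking the maximum is a conservative aggregation; averaging or summing the levels would be equally compatible with the text.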
In a second aspect, an embodiment of the present application provides an information acquisition device, including: an acquisition unit configured to obtain page content, within a predetermined time period, of web pages corresponding to a target entity; a first determination unit configured to determine whether the page content includes a risk word in a preset risk vocabulary; and a second determination unit configured to, in response to determining that the page content includes a risk word in the preset risk vocabulary, determine risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that the page content includes, where the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word.
In some embodiments, the acquisition unit is further configured to: crawl the page content of multiple preset web pages within the predetermined time period; determine the degree of association between the page content of each preset web page obtained and the target entity; and take the page content of the preset web pages whose degree of association with the target entity exceeds a predetermined threshold as the page content of the web pages corresponding to the target entity.
In some embodiments, the acquisition unit is further configured to: crawl, from a search engine, the page content of web pages associated with the target entity.
In some embodiments, the page content includes a title, and the first determination unit is further configured to: determine whether the title includes a risk word in the preset risk vocabulary.
In some embodiments, the information acquisition device further includes a third determination unit configured to: if it is determined that the page content does not include a risk word in the preset risk vocabulary, perform sentiment analysis on the title; if the sentiment orientation indicated by the sentiment analysis result is negative, perform semantic analysis on the page content; and determine the risk type corresponding to the page content according to the semantic analysis result.
In some embodiments, the risk information further includes a risk factor, and the information acquisition device further includes a fourth determination unit configured to: determine the risk factor of the target entity based on the number of times the name of the target entity appears in the page content and the position at which it appears in the page content, where the risk factor is positively correlated with the number of times the name of the target entity appears in the page content, and negatively correlated with the order of the position at which the name of the target entity appears in the page content.
In some embodiments, the first determination unit is further configured to: extract keywords from the page content; and determine whether each risk word in the preset risk vocabulary matches the keywords.
In some embodiments, the risk information further includes a risk factor, and the second determination unit is further configured to: in response to at least one risk word in the preset risk vocabulary matching the keywords, determine the risk type and the risk factor corresponding to the target entity based on the risk words that match the keywords.
In some embodiments, the risk information further includes a risk level, and the preset risk vocabulary stores risk words in association with the risk level corresponding to each risk word; and the second determination unit is further configured to: determine the risk type and the risk level corresponding to the target entity based on the risk words of the preset risk vocabulary that the page content includes.
In a third aspect, an embodiment of the present application provides a server, including: one or more processors; and a storage device for storing one or more programs, where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program, where the program, when executed by a processor, implements the method described in any implementation of the first aspect.
The information acquisition method and device provided by the embodiments of the present application obtain the page content, within a predetermined time period, of web pages corresponding to a target entity, then determine whether the page content includes a risk word in a preset risk vocabulary, and finally, in response to determining that the page content includes a risk word in the preset risk vocabulary, determine risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that the page content includes. This can improve the accuracy of the determined risk information corresponding to the target entity. When the method or device disclosed in the embodiments of the present application is applied to pushing the acquired information, the pertinence of the pushed information can be improved.
Brief description of the drawings
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
Fig. 1 is a diagram of an exemplary system architecture to which the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the information acquisition method according to the present application;
Fig. 3 is a flowchart of another embodiment of the information acquisition method according to the present application;
Fig. 4 is a flowchart of yet another embodiment of the information acquisition method according to the present application;
Fig. 5 is a structural diagram of one embodiment of the information acquisition device according to the present application;
Fig. 6 is a structural diagram of a computer system suitable for implementing the server of the embodiments of the present application.
Detailed description of the embodiments
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another. The present application is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a first server 104, a second server 105, and networks 106, 107, 108. Network 106 is the medium that provides communication links between the terminal devices 101, 102, 103 and the first server 104. Network 107 is the medium that provides a communication link between the first server 104 and the second server 105. Network 108 is the medium that provides communication links between the second server 105 and the terminal devices 101, 102, 103. Networks 106, 107, and 108 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the first server 104 through network 106 in order to receive or send messages and the like. A user may also use the terminal devices 101, 102, 103 to interact with the second server 105 through network 108 in order to receive or send messages and the like. Various client applications may be installed on the terminal devices 101, 102, 103, such as web browsers, shopping applications, search applications, instant messaging tools, email clients, and social platform software.
The terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, and desktop computers.
The first server 104 may interact with the second server 105 through network 107 in order to receive or send messages and the like.
The second server 105 may be a server that provides various services, for example a background server that receives and stores the search request data of a large number of users from the terminal devices 101, 102, 103 through network 108 and provides the corresponding web page content to the terminal devices 101, 102, 103 according to the search request data. The first server 104 may be a server that provides various services, for example a background server that obtains search request data from the second server 105 through network 107, processes the search request data, and pushes the processing result to the terminal devices 101, 102, 103.
It should be noted that the information pushing method provided by the embodiments of the present application is generally performed by the first server 104, and accordingly the information pushing device is generally arranged in the first server 104.
It is worth noting that, in some application scenarios, the first server and the second server may be the same physical server.
It should be understood that the numbers of terminal devices, networks, first servers, and second servers in Fig. 1 are merely illustrative. There may be any number of terminal devices, networks, first servers, and second servers, as the implementation requires.
With continued reference to Fig. 2, a flow 200 of one embodiment of the information acquisition method according to the present application is shown. The information acquisition method includes the following steps:
Step 201, obtain the page content, within a predetermined time period, of web pages corresponding to a target entity.
In this embodiment, the electronic device on which the information acquisition method runs (for example, the first server 104 shown in Fig. 1) may obtain, through a wired or wireless connection, the page content within the predetermined time period of web pages corresponding to the target entity from a web content server (for example, the second server 105 shown in Fig. 1). The predetermined time period may be a period that includes the current time, for example the period from 24 hours before the current time up to the current time. Alternatively, the predetermined time period may be any specified period. The target entity may be, for example, a pre-specified economic entity (such as an enterprise).
Here, a web page corresponding to the target entity may be, for example, a web page on the target entity's own website, in which case the page content of the web pages corresponding to the target entity is the page content of web pages on that website. The page content may also be page content that a web crawler crawls from the Internet according to the name of the target entity, where the name may be the full name or an abbreviation of the target entity. Page content crawled in this way may come from web pages that include the name of the target entity and are published by the administration for industry and commerce, by stock market portal websites, or by news portal websites. It may also come from web pages that include the name of the target entity and publish public opinion information, such as forums, blogs, microblogs, post-bar (tieba) groups, video sites, social networks, and electronic newspapers.
Step 202, determine whether the page content includes a risk word in the preset risk vocabulary.
The preset risk vocabulary may be stored in advance on the electronic device on which the information acquisition method runs, or the electronic device may access over a network a preset risk vocabulary stored on another physical device. In this embodiment, after obtaining in step 201 the page content, within the predetermined time period, of the web pages corresponding to the target entity, the electronic device (for example, the first server 104 shown in Fig. 1) may analyze the page content using various analysis means to determine whether it includes a risk word in the preset risk vocabulary.
In this embodiment, the risk types may include, but are not limited to: market risk, product risk, operating risk, investment risk, foreign exchange risk, personnel risk, system risk, merger and acquisition risk, natural disaster risk, quality risk, policy risk, legal risk, and diplomatic risk.
Each risk type may correspond to multiple risk words. A risk word here may be a single word or a phrase. For example, the risk words corresponding to operating risk may include, but are not limited to: insolvency, losses, closure, merger and acquisition, backdoor restructuring, production halt, equity pledge, unable to make ends meet, layoffs, bankruptcy reorganization, and total loss of capital. The risk words corresponding to personnel risk may include, but are not limited to: brain drain, improper appointment, and mass resignation. The risk words corresponding to merger and acquisition risk include: hostile takeover and hostile merger. The risk words corresponding to legal risk may include, but are not limited to: environmental pollution, tax evasion, tax avoidance, malicious dumping, and child labor. The preset risk vocabulary may store the risk words in association with the risk type of each risk word.
In this example, the electronic device may match each risk word in the preset risk vocabulary against the page content; if a risk word matches any word in the page content, it is determined that the page content includes a risk word in the preset risk vocabulary.
Step 203, in response to determining that the page content includes a risk word in the preset risk vocabulary, determine risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that the page content includes.
In this embodiment, in response to determining that the page content of the web pages includes a risk word in the preset risk vocabulary, the electronic device may determine the risk information corresponding to the target entity according to the risk types corresponding to the risk words of the preset risk vocabulary that the page content includes. The risk information corresponding to the target entity may include those risk types. In other words, the electronic device takes the risk types corresponding to the risk words of the preset risk vocabulary that the page content includes as the risk types corresponding to the target entity.
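Steps 202 and 203 can be sketched together as a vocabulary lookup. This is an illustrative sketch under stated assumptions: the vocabulary entries, the function names, and the returned dictionary layout are hypothetical, and whole-word matching with `\b` only suits space-delimited languages such as English, whereas Chinese page content would first need word segmentation.

```python
import re

# Hypothetical preset risk vocabulary: risk word -> risk type.
RISK_VOCABULARY = {
    "insolvency": "operating risk",
    "bankruptcy reorganization": "operating risk",
    "mass resignation": "personnel risk",
    "hostile takeover": "merger and acquisition risk",
    "tax evasion": "legal risk",
}


def find_risk_words(content: str, vocabulary: dict) -> list:
    """Step 202: return the risk words that occur in the page content."""
    text = content.lower()
    return [
        word for word in vocabulary
        if re.search(r"\b" + re.escape(word.lower()) + r"\b", text)
    ]


def risk_information(entity: str, content: str, vocabulary: dict = RISK_VOCABULARY):
    """Step 203: map the matched risk words to their stored risk types."""
    matched = find_risk_words(content, vocabulary)
    if not matched:
        return None  # the page content contains no risk word
    return {
        "entity": entity,
        "risk_words": matched,
        "risk_types": sorted({vocabulary[w] for w in matched}),
    }
```

Because the risk type is stored with each risk word, step 203 reduces to collecting the types of whatever words step 202 found.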
In some optional implementations of this embodiment, after the risk information corresponding to the target entity is determined in step 203, the electronic device may push the risk information to a preset terminal device. The preset terminal device may be a terminal device corresponding to the target entity. When the target entity is an economic entity, the terminal device corresponding to it may be, for example, a terminal device used by a decision-maker of the economic entity. The preset terminal device may also be a terminal device used by staff of a financial institution related to the target entity.
The method provided by the above embodiment of the present application determines the risk information corresponding to the target entity by analyzing page content related to the target entity, which improves the accuracy of the determined risk information. When the method or device disclosed in the embodiments of the present application is applied to pushing the acquired information, the pertinence of the pushed information can be improved.
In some optional implementations of this embodiment, obtaining in step 201 the page content, within the predetermined time period, of web pages corresponding to the target entity may further include the following sub-steps:
Step 2011, crawl the page content of multiple preset web pages within the predetermined time period.
In these optional implementations, the electronic device may use incremental crawling to obtain the page content of the multiple preset web pages once every predetermined time period. The page content obtained in each round does not overlap in publication time with that of other rounds, which ensures that the crawled page content of the preset web pages is updated in a timely manner. A preset web page here may be the web page corresponding to a pre-specified uniform resource locator (URL), or a web page of a pre-specified financial website, litigation website, administrative penalty announcement website, and so on. These preset web pages can reflect information about the target entity's finances, violations of law, administrative penalties, and the like. Because the information published on these web pages has typically been strictly reviewed, their page content is highly credible, which helps ensure the accuracy of the risk information corresponding to the target entity that is determined from it.
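The non-overlap property of incremental crawling can be illustrated with half-open time windows. This is a sketch under assumptions: the function names and the page record layout (a dict with a `published` timestamp) are hypothetical, and actual fetching of pages is left out.

```python
from datetime import datetime, timedelta


def crawl_windows(start: datetime, interval: timedelta, rounds: int):
    """Yield consecutive half-open windows [begin, end) that never overlap."""
    for i in range(rounds):
        begin = start + i * interval
        yield begin, begin + interval


def pages_for_window(pages: list, begin: datetime, end: datetime) -> list:
    """Keep the pages whose publication time falls inside [begin, end)."""
    return [p for p in pages if begin <= p["published"] < end]
```

Because each window includes its start and excludes its end, a page published exactly on a boundary is picked up by one round only, so no page is analyzed twice.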
Step 2012, determine the degree of association between the page content of each preset web page obtained and the target entity.
In these optional implementations, the electronic device may determine the name of the target entity, match the name against the page content of a preset web page, and determine the degree of association between that page content and the target entity according to the number of times the name of the target entity appears in the page. The number of times the name of the target entity appears in the page content is positively correlated with the degree of association between the page content and the target entity. The name of the target entity here may be its full name or an abbreviation. Abbreviations of the target entity may be derived by summarizing large volumes of Internet data.
In addition, when the target entity is an economic entity, the electronic device may also preset a default degree of association for each attribute of the economic entity. When the page content of a preset web page includes content related to an attribute of the economic entity, the default degree of association of that attribute may be taken as the degree of association between the page content and the economic entity. For example, one attribute of an economic entity is the type of economic activity it engages in. When the page content of a preset web page includes information related to that type of economic activity, the degree of association between the page content and the economic entity may be determined from the default degree of association assigned to the economic activity type. For example, for an enterprise engaged in clothing foreign trade, the corresponding economic activity attribute is "clothing foreign trade". When the page content of a preset web page on the Internet includes the message "clothing foreign trade enters a severe winter", the page content includes "clothing foreign trade", so the default degree of association assigned to the economic activity type attribute may be taken as the degree of association between the page content and the target entity.
Step 2013, take the page content of the preset web pages whose degree of association with the target entity exceeds a predetermined threshold as the page content of the web pages corresponding to the target entity.
In these optional implementations, the electronic device may take the page content of the preset web pages whose degree of association with the target entity exceeds the predetermined threshold as the page content of the web pages corresponding to the target entity. The predetermined threshold may be set according to the actual application and is not limited here.
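Sub-steps 2012 and 2013 can be sketched as a count followed by a filter. In this sketch the degree of association is simply the number of mentions of any of the entity's names, which is only one possible positively correlated measure; the function names and the sample threshold are assumptions, and the attribute-based default degree of association is not covered.

```python
import re


def association_degree(content: str, names: list) -> int:
    """Step 2012: count mentions of the entity's full name or abbreviations."""
    if not names:
        return 0
    # Try longer names first so "Acme Corporation" is not also counted as "Acme".
    ordered = sorted(names, key=len, reverse=True)
    pattern = "|".join(re.escape(n) for n in ordered)
    return len(re.findall(pattern, content))


def pages_for_entity(pages: list, names: list, threshold: int) -> list:
    """Step 2013: keep pages whose degree of association exceeds the threshold."""
    return [page for page in pages if association_degree(page, names) > threshold]
```

A page that mentions the entity only in passing falls below the threshold and is not analyzed for risk words.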
In this way, the electronic device can use incremental crawling to select, from the page content of multiple preset web pages, the page content that has a high degree of association with the target entity, and analyze it for risk information corresponding to the target entity. While ensuring the timeliness of the determined risk information, this on the one hand simplifies the process of determining the risk information corresponding to the target entity, and on the other hand improves the accuracy of the risk information determined from the page content of the preset web pages.
In some optional implementations of this embodiment, obtaining in step 201 the page content, within the predetermined time period, of web pages corresponding to the target entity may further include crawling, from a search engine, the page content of web pages associated with the target entity. The electronic device may crawl this page content from the search engine incrementally, that is, obtain the page content of multiple web pages once every predetermined time period. The page content obtained in each round does not overlap in publication time with that of other rounds, which ensures that the crawled page content of the web pages associated with the target entity is updated in a timely manner.
In some application scenarios, incremental crawling may be used to crawl, at the same time, the page content of web pages associated with the target entity from a search engine within the predetermined time period and the page content of multiple preset web pages within the predetermined time period. A clustering operation is then performed on the page content crawled from the search engine and the page content of the preset web pages. For the page content of web pages in the same cluster, risk information analysis may be performed on the page content of only one or a few of those web pages. In this way, on the one hand the crawled page content of the web pages corresponding to the target entity is more comprehensive, and on the other hand the amount of computation needed to analyze the risk information corresponding to the target entity is reduced.
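The clustering step can be illustrated with a simple greedy grouping. The application does not name a clustering algorithm, so the Jaccard similarity over token sets, the 0.5 threshold, and the choice of the first page in each cluster as its representative are all assumptions made for this sketch.

```python
import re


def tokens(text: str) -> set:
    """Lower-cased set of alphanumeric tokens of a page."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets (1.0 for two empty sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_pages(pages: list, threshold: float = 0.5) -> list:
    """Greedily assign each page to the first cluster whose seed page is similar."""
    clusters = []  # each cluster is a list of indices into pages
    for i, page in enumerate(pages):
        page_tokens = tokens(page)
        for cluster in clusters:
            if jaccard(page_tokens, tokens(pages[cluster[0]])) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])  # no similar cluster found, so start a new one
    return clusters


def representatives(pages: list, threshold: float = 0.5) -> list:
    """Return one page per cluster; only these are analyzed for risk words."""
    return [pages[cluster[0]] for cluster in cluster_pages(pages, threshold)]
```

Near-duplicate reports of the same event collapse into one cluster, so the risk word analysis runs once per event rather than once per page.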
In some optional implementations of this embodiment, the page content may include a title. Determining in step 202 whether the page content includes a risk word in the preset risk vocabulary may further include determining whether the title of the page content includes a risk word in the preset risk vocabulary. If the title includes such a risk word, the electronic device may determine the risk type corresponding to that risk word as the risk type corresponding to the target entity. This reduces the complexity of determining the risk information corresponding to the target entity from the page content of the web pages, and helps speed up that determination.
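A minimal sketch of the title check follows. The vocabulary entries are invented for illustration, `risk_types_in_title` is a hypothetical name, and plain substring matching is an assumed simplification (it would also match a risk word inside a longer word).

```python
# Hypothetical vocabulary: risk word -> risk type stored in association with it.
RISK_VOCABULARY = {
    "fine": "administrative penalty",
    "lawsuit": "litigation",
    "default": "credit",
}


def risk_types_in_title(title, vocabulary=RISK_VOCABULARY):
    """Return the risk types of all vocabulary words that occur in the title."""
    lowered = title.lower()
    return sorted({rtype for word, rtype in vocabulary.items() if word in lowered})
```

Checking only the title keeps the work proportional to the title length rather than to the full page content.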
In some optional implementations of this embodiment, the risk information corresponding to the target entity that is determined in step 203, in response to determining that the page content includes a risk word in the preset risk vocabulary and based on the risk words included in the page content, may include a risk factor in addition to the risk type. Here, the risk factor characterizes the probability that the risk occurs.
In these optional implementations, the electronic device may further determine the risk factor of the target entity based on the number of times the name of the target entity appears in the page content and the positions at which it appears. The risk factor is positively correlated with the number of times the name of the target entity appears in the page content, and negatively correlated with how late in the page content the name appears. In other words, the more often the name of the target entity appears in the page content, the greater the probability that the risk indicated by the risk type corresponding to the target entity occurs; and the earlier the name appears in the page content, the greater that probability.
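The application states only the direction of the two correlations, not a formula. The sketch below shows one possible scoring function that satisfies them; the functional form, the equal default weighting, and the name `risk_factor` are all assumptions made for illustration.

```python
def risk_factor(content, entity_name, count_weight=0.5):
    """Score in [0, 1] that rises with the mention count and falls as the
    first mention moves later in the content (assumed form, not from the source)."""
    count = content.count(entity_name)
    if count == 0:
        return 0.0
    first = content.find(entity_name)
    count_term = count / (count + 1)             # grows with count, stays below 1
    position_term = 1.0 - first / len(content)   # 1.0 when the name opens the text
    return count_weight * count_term + (1 - count_weight) * position_term
```

For instance, a page that opens with the entity name and mentions it twice scores higher than a page that mentions it once near the end.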
In some optional implementations of this embodiment, the risk information corresponding to the target entity that is determined in step 203 further includes a risk level in addition to the risk type. The preset risk vocabulary stored in the electronic device stores risk words in association with the risk level corresponding to each risk word. In response to determining that the page content includes a risk word in the preset risk vocabulary, the electronic device may determine the risk type and the risk level corresponding to the target entity based on the risk words of the preset risk vocabulary that are included in the page content. Specifically, the risk type and the risk level corresponding to those risk words may be determined as the risk type and the risk level corresponding to the target entity. The risk level may be, for example, high risk, medium risk, or low risk.
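A vocabulary that stores both a risk type and a risk level per risk word might look like the following sketch. The entries, the `RiskEntry` type, and the `lookup_risks` function are hypothetical, and substring matching is again an assumed simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskEntry:
    risk_type: str
    risk_level: str  # "high", "medium" or "low"


# Hypothetical vocabulary: risk word -> associated type and level.
VOCAB = {
    "revoked license": RiskEntry("administrative penalty", "high"),
    "fine": RiskEntry("administrative penalty", "medium"),
    "warning": RiskEntry("administrative penalty", "low"),
}


def lookup_risks(content, vocabulary=VOCAB):
    """Return the entries of every risk word found in the page content."""
    lowered = content.lower()
    return {word: entry for word, entry in vocabulary.items() if word in lowered}
```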
In some application scenarios, the page content of the web page may be, for example, the page content of a web page that publishes administrative penalty information. The electronic device may determine whether the title of the page content includes a risk word in the preset risk vocabulary. If it does, the risk type corresponding to the target entity, and the risk level corresponding to that risk type, may be determined from the risk words of the preset risk vocabulary that are included in the title.
With further reference to Fig. 3, a flow 300 of another embodiment of the information acquisition method is shown. The flow 300 of the information acquisition method includes the following steps:
Step 301: obtain the page content of web pages corresponding to the target entity within a predetermined time period.
Step 301 is the same as step 201 in the embodiment shown in Fig. 2 and is not repeated here.
Step 302: determine whether the page content includes a risk word in the preset risk vocabulary.
Step 302 is the same as step 202 in the embodiment shown in Fig. 2 and is not repeated here.
Step 303: in response to determining that the page content includes a risk word in the preset risk vocabulary, determine the risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that are included in the page content.
Step 303 is the same as step 203 in the embodiment shown in Fig. 2 and is not repeated here.
Step 304: in response to determining that the page content does not include any risk word in the preset risk vocabulary, perform sentiment analysis on the title.
In this embodiment, the page content of the web page may include a title. When it is determined in step 302 that the page content of the web page does not include any risk word in the preset risk vocabulary, the electronic device on which the information acquisition method runs (for example, the first server 104 shown in Fig. 1) may, in response, perform sentiment analysis on the title of the page content using any of various sentiment analysis methods. These may include, for example, dictionary-based sentiment analysis, machine-learning-based sentiment analysis, hybrid sentiment analysis combining a dictionary with machine learning, sentiment analysis based on weakly labeled information, and deep-learning-based sentiment analysis. It should be noted that these sentiment analysis methods are well-known techniques that are widely studied and applied, and are not described in detail here.
Step 305: if the sentiment orientation indicated by the sentiment analysis result is negative, perform semantic analysis on the page content.
When the sentiment orientation indicated by the sentiment analysis result obtained in step 304 for the title of the page content is negative, the electronic device may further perform semantic analysis on the page content of the web page using a semantic analysis method. The semantic analysis method may be, for example, a topic-model-based semantic analysis method.
It should be noted that such semantic analysis methods are well-known techniques that are widely studied and applied, and are not described in detail here.
Step 306: determine the risk type corresponding to the page content according to the semantic analysis result.
The electronic device may determine the risk type corresponding to the page content of the web page according to the semantic analysis result obtained by performing semantic analysis on the page content in step 305.
That is, the electronic device may extract the semantics of the page content of the web page and determine the risk type corresponding to the page content from those semantics.
In some application scenarios, the page content of the web page may be, for example, the page content of a web page that publishes administrative penalty information. After the electronic device determines that the page content of the web page does not include any risk word in the preset risk vocabulary, it may perform semantic analysis on the page content and determine the risk type corresponding to the page content according to the result of the semantic analysis.
As can be seen from Fig. 3, compared with the embodiment corresponding to Fig. 2, the flow 300 of the information acquisition method in this embodiment highlights first determining the sentiment orientation of the title through sentiment analysis, then performing semantic analysis on the page content when the title has a negative sentiment orientation, and determining the risk information corresponding to the page content according to the semantic analysis result. This embodiment can therefore make the risk information corresponding to the target entity more comprehensive and further improve the targeting of information push.
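The control flow of steps 302 to 306 can be sketched as follows. The sentiment and semantic analyzers are passed in as callables because the application leaves the concrete methods open; `lexicon_sentiment` and `toy_topic` are deliberately trivial, hypothetical stand-ins (a word-list score and a keyword rule), not the dictionary-based or topic-model methods themselves.

```python
def determine_risk_types(title, content, vocabulary, title_sentiment, classify_topic):
    """Vocabulary match first; otherwise fall back to title sentiment,
    then to semantic analysis of the content."""
    lowered = content.lower()
    matched = sorted({rtype for word, rtype in vocabulary.items() if word in lowered})
    if matched:                        # steps 302 and 303
        return matched
    if title_sentiment(title) >= 0:    # step 304: title is not negative, stop
        return []
    topic = classify_topic(content)    # steps 305 and 306
    return [topic] if topic else []


# Hypothetical stand-ins for the pluggable analyzers.
NEGATIVE_WORDS = {"scandal", "collapse", "probe"}


def lexicon_sentiment(title):
    """Negative score when the title contains words from a negative word list."""
    return -sum(word in NEGATIVE_WORDS for word in title.lower().split())


def toy_topic(content):
    """Map content to a risk type with a single keyword rule."""
    return "regulatory" if "regulator" in content.lower() else None
```

The fallback only runs when the vocabulary lookup finds nothing, so the more expensive semantic analysis is reserved for pages whose title already signals a negative orientation.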
With further reference to Fig. 4, a flow 400 of yet another embodiment of the information acquisition method is shown. The flow 400 of the information acquisition method includes the following steps:
Step 401: obtain the page content of web pages corresponding to the target entity within a predetermined time period.
Step 401 is the same as step 201 in the embodiment shown in Fig. 2 and is not repeated here.
Step 402: extract keywords of the page content.
In this embodiment, after the electronic device on which the information acquisition method runs obtains, in step 401, the page content of the web pages corresponding to the target entity within the predetermined time period, it may extract keywords of the page content of the web pages using any of various keyword extraction methods.
Step 403: determine whether each risk word in the preset risk vocabulary matches a keyword.
In this embodiment, after the keywords of the page content of the web page are extracted in step 402, the electronic device may determine whether each risk word in the preset risk vocabulary matches a keyword. Specifically, each risk word in the preset risk vocabulary may be matched, one by one, against the page content of the web page. In this way it can be determined whether each risk word in the preset risk vocabulary matches a keyword of the page content of the web page.
Step 404: in response to determining that at least one risk word in the preset risk vocabulary matches a keyword, determine the risk type and the risk factor corresponding to the target entity based on the risk word that matches the keyword.
In this embodiment, in response to determining that at least one risk word in the preset risk vocabulary matches a keyword of the page content of the web page, the electronic device may determine the risk type corresponding to the matching risk word as the risk type corresponding to the target entity. Further, the electronic device may determine the risk factor of the risk type corresponding to the target entity. Here, the risk factor may be determined according to preset rules. For example, the preset rules may specify the risk factor of the risk type corresponding to a risk word when that risk word appears among the keywords.
In some application scenarios, the page content of the web page may be, for example, the page content of a web page that publishes administrative penalty information. After determining that the page content of the web page publishing administrative penalty information is related to the target entity (for example, an economic entity), the electronic device may extract keywords of the page content. When a keyword of the page content matches a risk word in the risk vocabulary, the risk type corresponding to the target entity may be determined to be the risk type corresponding to the matched risk word. In addition, the risk factor corresponding to that risk type may be determined. Usually, a risk type matched through the keywords of the page content has a relatively large risk factor; that is, the probability that the risk indicated by the risk type occurs is relatively high.
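Steps 402 to 404 can be sketched as follows. The application allows any keyword extraction method; the frequency-based extraction here is an assumed stand-in, and the stopword list, the `factor_rules` table, and the function names are hypothetical.

```python
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "for", "and", "was", "to", "in"}


def extract_keywords(content, top_k=5):
    """Return the top_k most frequent non-stopword tokens (assumed method)."""
    tokens = [token.strip(".,;:!?").lower() for token in content.split()]
    counts = Counter(token for token in tokens if token and token not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]


def match_risks(keywords, vocabulary, factor_rules):
    """Map each risk type matched by a keyword to its preset-rule risk factor."""
    result = {}
    for word in keywords:
        if word in vocabulary:
            risk_type = vocabulary[word]
            result[risk_type] = factor_rules.get(risk_type, 0.0)
    return result
```

Matching against a handful of keywords rather than the full text is what simplifies the determination step in this embodiment.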
As can be seen from Fig. 4, compared with the embodiment corresponding to Fig. 2, the flow 400 of the information acquisition method in this embodiment highlights the steps of extracting keywords of the page content and determining the risk information corresponding to the target entity according to those keywords. In this way, on the one hand, the step of determining the risk information corresponding to the target entity is further simplified; on the other hand, the accuracy of the determined risk information corresponding to the target entity can be further improved.
With further reference to Fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an information acquisition device. The device embodiment corresponds to the method embodiment shown in Fig. 2, and the device may be applied to various electronic devices.
As shown in Fig. 5, the information acquisition device 500 of this embodiment includes an acquiring unit 501, a first determination unit 502, and a second determination unit 503. The acquiring unit 501 is configured to obtain the page content of web pages corresponding to the target entity within a predetermined time period. The first determination unit 502 is configured to determine whether the page content includes a risk word in the preset risk vocabulary. The second determination unit 503 is configured to, in response to determining that the page content includes a risk word in the preset risk vocabulary, determine the risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that are included in the page content, wherein the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word.
In this embodiment, for the specific processing of the acquiring unit 501, the first determination unit 502, and the second determination unit 503 of the information acquisition device 500, and the technical effects they produce, reference may be made to the descriptions of step 201, step 202, and step 203 in the embodiment corresponding to Fig. 2, respectively; they are not repeated here.
In some optional implementations of this embodiment, the acquiring unit 501 is further configured to: crawl the page content of multiple preset web pages within the predetermined time period; determine the degree of association between the page content of each obtained preset web page and the target entity; and take the page content of the preset web pages whose degree of association with the target entity is greater than a predetermined threshold as the page content of the web pages corresponding to the target entity.
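The threshold filter performed by the acquiring unit can be sketched as follows. How the degree of association is computed is not restated in this passage, so `mention_ratio` (the share of sentences that mention the entity name) is only an assumed example measure, and the function names and the default `threshold` are hypothetical.

```python
def mention_ratio(content, entity_name):
    """Assumed association measure: fraction of sentences naming the entity."""
    sentences = [s for s in content.split(".") if s.strip()]
    if not sentences:
        return 0.0
    return sum(entity_name in s for s in sentences) / len(sentences)


def select_entity_pages(pages, entity_name, threshold=0.3, score=mention_ratio):
    """Keep pages whose association with the entity exceeds the threshold."""
    return [page for page in pages if score(page, entity_name) > threshold]
```

The `score` parameter lets a different association measure be substituted without changing the filter itself.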
In some optional implementations of this embodiment, the acquiring unit 501 is further configured to crawl the page content of web pages associated with the target entity from a search engine.
In some optional implementations of this embodiment, the page content includes a title, and the first determination unit 502 is further configured to determine whether the title includes a risk word in the preset risk vocabulary.
In some optional implementations of this embodiment, the information acquisition device further includes a third determination unit 504. The third determination unit 504 is configured to: perform sentiment analysis on the title if it is determined that the page content does not include any risk word in the preset risk vocabulary; perform semantic analysis on the page content if the sentiment orientation indicated by the sentiment analysis result is negative; and determine the risk type corresponding to the page content according to the semantic analysis result.
In some optional implementations of this embodiment, the risk information further includes a risk factor, and the information acquisition device further includes a fourth determination unit 505. The fourth determination unit 505 is configured to determine the risk factor of the target entity based on the number of times the name of the target entity appears in the page content and the positions at which it appears, wherein the risk factor is positively correlated with the number of times the name of the target entity appears in the page content, and negatively correlated with how late in the page content the name of the target entity appears.
In some optional implementations of this embodiment, the first determination unit 502 is further configured to: extract keywords of the page content; and determine whether each risk word in the preset risk vocabulary matches a keyword.
In some optional implementations of this embodiment, the risk information further includes a risk factor, and the second determination unit 503 is further configured to: in response to at least one risk word in the preset risk vocabulary matching a keyword, determine the risk type and the risk factor corresponding to the target entity based on the risk word that matches the keyword.
In some optional implementations of this embodiment, the risk information further includes a risk level, and the preset risk vocabulary stores risk words in association with the risk level corresponding to each risk word; and the second determination unit 503 is further configured to determine the risk type and the risk level corresponding to the target entity based on the risk words of the preset risk vocabulary that are included in the page content.
Referring now to Fig. 6, a schematic structural diagram of a computer system 600 suitable for implementing the terminal device/server of the embodiments of the present application is shown. The terminal device/server shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a local area network (LAN) card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product that includes a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such embodiments, the computer program may be downloaded and installed from a network through the communication section 609 and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the methods of the present application are performed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program may be used by or in connection with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and that computer-readable medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted using any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the above.
The flowcharts and block diagrams in the accompanying drawings illustrate the architectures, functions, and operations of possible implementations of the systems, methods, and computer program products according to the various embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the boxes may occur in an order different from that noted in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented in software or in hardware. The described units may also be provided in a processor; for example, a processor may be described as including an acquiring unit, a first determination unit, and a second determination unit. The names of these units do not, in some cases, limit the units themselves. For example, the acquiring unit may also be described as "a unit that obtains the page content of web pages corresponding to a target entity within a predetermined time period".
In another aspect, the present application also provides a computer-readable medium. The computer-readable medium may be included in the device described in the above embodiments, or it may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: obtain the page content of web pages corresponding to a target entity within a predetermined time period; determine whether the page content includes a risk word in a preset risk vocabulary; and, in response to determining that the page content includes a risk word in the preset risk vocabulary, determine risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that are included in the page content, wherein the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word.
The above description covers only the preferred embodiments of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalents without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with (but not limited to) technical features having similar functions disclosed in the present application.
Claims (20)
1. An information acquisition method, comprising:
obtaining the page content of web pages corresponding to a target entity within a predetermined time period;
determining whether the page content includes a risk word in a preset risk vocabulary; and
in response to determining that the page content includes a risk word in the preset risk vocabulary, determining risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that are included in the page content, wherein the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word.
2. The method according to claim 1, wherein obtaining the page content of web pages corresponding to the target entity within the predetermined time period comprises:
crawling the page content of multiple preset web pages within the predetermined time period;
determining the degree of association between the page content of each obtained preset web page and the target entity; and
taking the page content of the preset web pages whose degree of association with the target entity is greater than a preset association threshold as the page content of the web pages corresponding to the target entity.
3. The method according to claim 1, wherein obtaining the page content of web pages corresponding to the target entity within the predetermined time period comprises:
crawling the page content of web pages associated with the target entity from a search engine.
4. The method according to claim 2, wherein the page content includes a title, and
determining whether the page content includes a risk word in the preset risk vocabulary comprises:
determining whether the title includes a risk word in the preset risk vocabulary.
5. The method according to claim 4, wherein the method further comprises:
performing sentiment analysis on the title if it is determined that the page content does not include any risk word in the preset risk vocabulary;
performing semantic analysis on the page content if the sentiment orientation indicated by the sentiment analysis result is negative; and
determining the risk type corresponding to the page content according to the semantic analysis result.
6. The method according to claim 1, wherein the risk information further includes a risk factor, and the method further comprises:
determining the risk factor of the target entity based on the number of times the name of the target entity appears in the page content and the positions at which it appears in the page content, wherein the risk factor is positively correlated with the number of times the name of the target entity appears in the page content, and negatively correlated with how late in the page content the name of the target entity appears.
7. The method according to claim 1, wherein determining whether the page content includes a risk word in the preset risk vocabulary comprises:
extracting keywords of the page content; and
determining whether each risk word in the preset risk vocabulary matches a keyword.
8. The method according to claim 7, wherein the risk information further includes a risk factor, and
determining, in response to determining that the page content includes a risk word in the preset risk vocabulary, the risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that are included in the page content comprises:
in response to at least one risk word in the preset risk vocabulary matching a keyword, determining the risk type and the risk factor corresponding to the target entity based on the risk word that matches the keyword.
9. The method according to claim 1, wherein the risk information further includes a risk level, and the preset risk vocabulary stores risk words in association with the risk level corresponding to each risk word; and
determining, in response to determining that the page content includes a risk word in the preset risk vocabulary, the risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that are included in the page content comprises:
determining the risk type and the risk level corresponding to the target entity based on the risk words of the preset risk vocabulary that are included in the page content.
10. An information acquisition device, comprising:
an acquiring unit configured to obtain the page content of web pages corresponding to a target entity within a predetermined time period;
a first determination unit configured to determine whether the page content includes a risk word in a preset risk vocabulary; and
a second determination unit configured to, in response to determining that the page content includes a risk word in the preset risk vocabulary, determine risk information corresponding to the target entity based on the risk words of the preset risk vocabulary that are included in the page content, wherein the risk information includes a risk type, and the preset risk vocabulary stores risk words in association with the risk type of each risk word.
11. The device according to claim 10, wherein the acquiring unit is further configured to:
capture the page content of a plurality of preset web pages within the predetermined time period;
determine the degree of association between the page content of each captured preset web page and the target entity; and
use the page content of each preset web page whose degree of association with the target entity exceeds a predetermined threshold as the page content of the web page corresponding to the target entity.
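The filtering step of claim 11 can be sketched as follows. The claim does not say how the degree of association is computed, so this example assumes a simple stand-in: the fraction of sentences that mention the entity name. Both function names and the measure itself are hypothetical.

```python
import re


def association_degree(page_content: str, entity_name: str) -> float:
    """Hypothetical measure: fraction of sentences that mention the entity."""
    sentences = [s for s in re.split(r"[.!?]+", page_content) if s.strip()]
    if not sentences:
        return 0.0
    hits = sum(entity_name.lower() in s.lower() for s in sentences)
    return hits / len(sentences)


def select_entity_pages(pages: list, entity_name: str, threshold: float) -> list:
    """Keep only the pages whose association degree exceeds the threshold."""
    return [p for p in pages if association_degree(p, entity_name) > threshold]
```

Any other relevance score (for example a similarity score from a text model) could replace `association_degree` without changing the thresholding logic.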
12. The device according to claim 10, wherein the acquiring unit is further configured to:
crawl, from a search engine, the page content of web pages associated with the target entity.
13. The device according to claim 11, wherein the page content includes a title; and
the first determination unit is further configured to:
determine whether the title includes a risk word in the preset risk vocabulary.
14. The device according to claim 12, wherein the device further includes a third determination unit, the third determination unit being configured to:
if it is determined that the page content does not include a risk word in the preset risk vocabulary, perform sentiment analysis on the title;
if the sentiment orientation indicated by the sentiment analysis result is negative, perform semantic analysis on the page content; and
determine the risk type corresponding to the page content according to the semantic analysis result.
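The fallback path of claim 14 can be sketched as a small control-flow function. The claim names no particular sentiment or semantic analysis technique, so both analyzers are passed in as callables; the function name, the parameter names, and the convention that a negative score means negative sentiment are all assumptions made for illustration.

```python
from typing import Callable, Optional


def fallback_risk_type(
    title: str,
    page_content: str,
    contains_risk_word: Callable[[str], bool],
    title_sentiment: Callable[[str], float],
    classify_semantics: Callable[[str], str],
) -> Optional[str]:
    """Third determination unit: used only when no risk word is found."""
    if contains_risk_word(page_content):
        return None  # the vocabulary-based path handles this page
    if title_sentiment(title) >= 0:
        return None  # title is not negative, so no semantic analysis is run
    # Negative title: derive the risk type from semantic analysis of the page.
    return classify_semantics(page_content)
```

Passing the analyzers as parameters keeps the ordering of the three steps visible while leaving the actual models open.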
15. The device according to claim 10, wherein the risk information further includes a risk factor, and the device further includes a fourth determination unit;
the fourth determination unit is configured to:
determine the risk factor of the target entity based on the number of times the name of the target entity appears in the page content and the position at which it appears in the page content, wherein the risk factor is positively correlated with the number of times the name of the target entity appears in the page content, and the risk factor of the target entity is negatively correlated with the order of the position at which the name of the target entity appears in the page content.
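Claim 15 states only the direction of the two correlations, not a formula. The sketch below assumes one possible formula: the occurrence count multiplied by a weight that shrinks as the first occurrence moves later in the text. The function name and the formula are hypothetical, and the reading that an earlier appearance yields a higher risk factor is an interpretation of the claim.

```python
def risk_factor(page_content: str, entity_name: str) -> float:
    """Hypothetical risk factor: grows with count, shrinks with later position."""
    text = page_content.lower()
    name = entity_name.lower()
    count = text.count(name)
    if count == 0:
        return 0.0
    first = text.find(name)
    # 0.0 means the name opens the page; values near 1.0 mean it appears last.
    relative_position = first / len(text)
    return count * (1.0 - relative_position)
```

Under this formula, a page that opens with the entity name and repeats it scores higher than a page that mentions it once near the end.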
16. The device according to claim 10, wherein the first determination unit is further configured to:
extract keywords of the page content; and
determine whether each risk word in the preset risk vocabulary matches the keywords.
17. The device according to claim 16, wherein the risk information further includes a risk factor; and
the second determination unit is further configured to:
determine, in response to at least one risk word in the preset risk vocabulary matching the keywords, the risk type and the risk factor corresponding to the target entity based on the risk word that matches the keywords.
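The keyword route of claims 16 and 17 can be sketched as below. The claims do not specify a keyword extraction method or how the risk factor follows from the matched words, so this example assumes frequency-based keywords with a small stopword list and a risk factor equal to the share of keywords that are risk words. All names, the stopword list, and both heuristics are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list used only for this illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "over"}


def extract_keywords(page_content: str, top_k: int = 5) -> list:
    """Take the most frequent non-stopword tokens as keywords."""
    tokens = re.findall(r"[a-z]+", page_content.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_k)]


def match_risk_words(keywords: list, vocabulary: dict) -> dict:
    """Return risk types and an assumed risk factor for matched risk words."""
    matched = [word for word in vocabulary if word in keywords]
    if not matched:
        return {}
    return {
        "risk_types": {vocabulary[word] for word in matched},
        # Assumption: risk factor = share of keywords that are risk words.
        "risk_factor": len(matched) / len(keywords),
    }
```

Matching against extracted keywords rather than the full text reduces hits on risk words that appear only in passing.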
18. The device according to claim 10, wherein the risk information further includes a risk level, and the preset risk vocabulary stores, in association, risk words and the risk level corresponding to each risk word; and
the second determination unit is further configured to:
determine the risk type and the risk level corresponding to the target entity based on the risk word in the preset risk vocabulary included in the page content.
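For claim 18, the vocabulary can be pictured as storing a risk type and a risk level per risk word. The sketch below assumes integer levels and, where several matched words share a risk type, keeps the highest level; the claim does not specify how levels are combined, so that rule, the sample entries, and the function name are hypothetical.

```python
# Hypothetical vocabulary: risk word -> (risk type, risk level).
LEVELED_VOCABULARY = {
    "lawsuit": ("legal", 2),
    "fraud": ("legal", 3),
    "recall": ("product quality", 1),
}


def risk_types_and_levels(page_content: str, vocabulary: dict) -> dict:
    """Return the highest matched risk level for each matched risk type."""
    text = page_content.lower()
    levels = {}
    for word, (risk_type, level) in vocabulary.items():
        if word in text:
            levels[risk_type] = max(level, levels.get(risk_type, 0))
    return levels
```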
19. A server, including:
one or more processors; and
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-9.
20. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the method according to any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810049805.3A CN108256078B (en) | 2018-01-18 | 2018-01-18 | Information acquisition method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810049805.3A CN108256078B (en) | 2018-01-18 | 2018-01-18 | Information acquisition method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108256078A (en) | 2018-07-06 |
CN108256078B (en) | 2019-07-12 |
Family
ID=62741320
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810049805.3A Active CN108256078B (en) | 2018-01-18 | 2018-01-18 | Information acquisition method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108256078B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP5972425B1 (en) * | 2015-05-08 | 2016-08-17 | 株式会社エルプランニング | Reputation damage risk report creation system, program and method |
CN104951548A (en) * | 2015-06-24 | 2015-09-30 | 烟台中科网络技术研究所 | Method and system for calculating negative public opinion index |
US20170004128A1 (en) * | 2015-07-01 | 2017-01-05 | Institute for Sustainable Development | Device and method for analyzing reputation for objects by data mining |
CN105956740A (en) * | 2016-04-19 | 2016-09-21 | 北京深度时代科技有限公司 | Semantic risk calculating method based on text logical characteristic |
CN105844424A (en) * | 2016-05-30 | 2016-08-10 | 中国计量学院 | Product quality problem discovery and risk assessment method based on network comments |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111914064A (en) * | 2020-07-29 | 2020-11-10 | 王嘉兴 | Text mining method, device, equipment and medium |
CN112862305A (en) * | 2021-02-03 | 2021-05-28 | 北京百度网讯科技有限公司 | Method, device, equipment and storage medium for determining risk state of object |
CN116910231A (en) * | 2023-09-11 | 2023-10-20 | 社治无忧(成都)智慧科技有限公司 | WeChat public opinion early warning method and system based on natural language processing |
CN116910231B (en) * | 2023-09-11 | 2023-11-17 | 社治无忧(成都)智慧科技有限公司 | WeChat public opinion early warning method and system based on natural language processing |
Also Published As
Publication number | Publication date |
---|---|
CN108256078B (en) | 2019-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106383875B (en) | | Man-machine interaction method and device based on artificial intelligence |
CN107491534A (en) | | Information processing method and device |
CN107491547A (en) | | Searching method and device based on artificial intelligence |
CN107729319A (en) | | Method and apparatus for output information |
CN107295095A (en) | | The method and apparatus for pushing and showing advertisement |
CN109492772A (en) | | The method and apparatus for generating information |
CN104598218B (en) | | For merging and reusing the method and system of gateway information |
CN107679217A (en) | | Association method for extracting content and device based on data mining |
CN106407361A (en) | | Method and device for pushing information based on artificial intelligence |
CN108572990A (en) | | Information-pushing method and device |
CN108256078B (en) | | Information acquisition method and device |
WO2019231772A1 (en) | | Systems and methods for crypto currency automated transaction flow detection |
CN107634947A (en) | | Limitation malice logs in or the method and apparatus of registration |
CN107943895A (en) | | Information-pushing method and device |
CN107977678A (en) | | Method and apparatus for output information |
CN107169077A (en) | | Method and apparatus for pushed information |
CN107783962A (en) | | Method and device for query statement |
US20220292160A1 (en) | | Automated system and method for creating structured data objects for a media-based electronic document |
CN107807937A (en) | | A kind of website SEO processing methods, apparatus and system |
CN108540508A (en) | | Method, apparatus and equipment for pushed information |
CN112084342A (en) | | Test question generation method and device, computer equipment and storage medium |
CN107368489A (en) | | A kind of information data processing method and device |
CN107766498A (en) | | Method and apparatus for generating information |
CN111078849A (en) | | Method and apparatus for outputting information |
CN108959289B (en) | | Website category acquisition method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||