CN113704400A - False news identification method, device, equipment and chip - Google Patents

False news identification method, device, equipment and chip

Info

Publication number
CN113704400A
Authority
CN
China
Prior art keywords
information
news
effective
acquiring
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110940711.7A
Other languages
Chinese (zh)
Other versions
CN113704400B (en)
Inventor
支晓繁
薛利
赵博
王砚溱
李子烨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Futures Information Technology Co ltd
Original Assignee
Shanghai Futures Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Futures Information Technology Co ltd
Priority to CN202110940711.7A
Publication of CN113704400A
Application granted
Publication of CN113704400B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a false news identification method and a false news identification device. The method comprises: acquiring first information to be identified and auxiliary information thereof; acquiring text content of the first information and the auxiliary information; screening effective auxiliary information from the auxiliary information; acquiring characteristic information of the first information and of the effective auxiliary information; and determining whether the first information is false news according to the text content and/or the characteristic information of the first information and the effective auxiliary information. By extracting the content of the first information and the auxiliary information, and by determining whether the first information is false news according to the effective auxiliary information together with the text content and/or the characteristic information of the first information, the method and the device increase the diversity of data and features, improve judgment accuracy, and improve data processing efficiency.

Description

False news identification method, device, equipment and chip
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a false news identification method, a false news identification device, false news identification equipment and a false news identification chip.
Background
Traditional news is usually published in official newspapers and is official and authoritative in nature. False news, by contrast, has traditionally been spread as hearsay, gossip and the like; it is not preserved in any recorded form, and its reach is limited. The development of social media provides more channels for the dissemination of news and creates favorable conditions for the dissemination of false news.
With the development of the internet and information technology, the number of news platforms and social platforms is increasing. On the one hand, the volume of news information has grown; on the other hand, people can freely express themselves, acquire knowledge and interact on social media. Because speaking on social networks is easy and distributing information is cheap, social networks foster crowd wisdom but may also lead to the proliferation of large amounts of false or unverified information.
The proliferation of false news seriously affects people's lives, trading order, social stability and even national security, so effectively identifying false news is a problem that must be solved in the current social context. How to quickly assess the credibility of news has become one of the major problems currently faced.
Disclosure of Invention
In order to solve the problems in the prior art, at least one embodiment of the present invention provides a method, an apparatus, a device, and a chip for identifying false news, which can overcome the defects in the prior art and improve the accuracy and efficiency of identifying false news.
In a first aspect, an embodiment of the present invention provides a false news identification method, including: acquiring first information to be identified and auxiliary information thereof; acquiring text content of the first information and the auxiliary information; screening effective auxiliary information from the auxiliary information; acquiring characteristic information of the first information and of the effective auxiliary information according to the text content of the first information and the effective auxiliary information; and determining whether the first information is false news according to the text content and/or the characteristic information of the first information and the effective auxiliary information.
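By way of non-limiting illustration, the order of the steps of the first aspect can be sketched in Python as follows. This is a minimal sketch, not the claimed implementation: the names `Item`, `identify_false_news`, `is_effective`, `extract_features` and `classify` are hypothetical, and the screening rule, the feature extractor and the final classifier are supplied by the caller because the text above does not fix them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Item:
    """Hypothetical container for a piece of information whose text content
    has already been extracted (second step of the method)."""
    text: str


def identify_false_news(
    first: Item,
    auxiliary: List[Item],
    is_effective: Callable[[Item, Item], bool],
    extract_features: Callable[[str], List[float]],
    classify: Callable[[List[float], List[List[float]]], bool],
) -> bool:
    """Run the screening, feature and decision steps in order.

    All three callables are assumptions standing in for components
    that the description leaves open.
    """
    # Screen effective auxiliary information from the auxiliary information.
    effective = [a for a in auxiliary if is_effective(first, a)]
    # Acquire characteristic information from the text content.
    first_features = extract_features(first.text)
    aux_features = [extract_features(a.text) for a in effective]
    # Decide whether the first information is false news.
    return classify(first_features, aux_features)
```

Passing the three components as callables keeps the sketch independent of any particular screening criterion or model.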
In some embodiments, the characteristic information is viewpoint characteristic information.
In some embodiments, determining whether the first information is false news according to the text content and/or the characteristic information of the first information and the effective auxiliary information includes: acquiring a first relation between the viewpoint characteristic information of the first information and the viewpoint characteristic information of its effective auxiliary information; and determining whether the first information is false news according to the first relation.
In some embodiments, the method further comprises: acquiring at least one piece of second information related to the topic of the first information; and acquiring text content of the second information, and acquiring viewpoint characteristic information of the second information according to the text content. In this case, determining whether the first information is false news according to the text content and/or the characteristic information of the first information and the effective auxiliary information comprises: acquiring a second relation between the viewpoint characteristic information of the first information and the viewpoint characteristic information of the second information; and determining whether the first information is false news according to the second relation.
In some embodiments, screening effective auxiliary information from the auxiliary information includes at least one of: screening effective auxiliary information according to the relevance of the auxiliary information to the topic of the first information; or acquiring emotional intensity characteristics of the auxiliary information, and screening effective auxiliary information according to the emotional intensity characteristics.
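By way of non-limiting illustration, the screening can be sketched as follows. Several points are assumptions made only for the sketch: topic relevance is approximated by Jaccard overlap of whitespace-separated tokens, the emotional intensity score is supplied by the caller, and the accepted intensity range is a free parameter, since the text above does not state how relevance is measured or which intensities count as effective. The names `jaccard` and `screen_effective` are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-set overlap, used here as a stand-in for topic relevance."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def screen_effective(
    news_text: str,
    comments: List[str],
    min_relevance: float = 0.1,
    intensity: Optional[Callable[[str], float]] = None,
    intensity_range: Tuple[float, float] = (0.0, 1.0),
) -> List[str]:
    """Keep comments that are relevant enough and, if an intensity scorer
    is given, whose emotional intensity falls inside intensity_range."""
    low, high = intensity_range
    kept = []
    for comment in comments:
        if jaccard(news_text, comment) < min_relevance:
            continue  # not related to the topic of the news
        if intensity is not None and not (low <= intensity(comment) <= high):
            continue  # emotional intensity outside the accepted range
        kept.append(comment)
    return kept
```

The sketch always applies the relevance criterion and applies the intensity criterion only when a scorer is supplied; either criterion could equally be used alone.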
In some embodiments, the first information comprises: text information, picture information, video information, or audio information; acquiring the text content of the first information to be recognized includes extracting the text content from the text information, the picture information, the video information or the audio information of the first information.
In some embodiments, when the first information is video information, acquiring the text content of the first information to be identified includes: segmenting the video to obtain single-frame images; de-duplicating the single-frame images to obtain effective image information; and acquiring the text content of the effective image information.
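By way of non-limiting illustration, the de-duplication of single-frame images can be sketched as follows. The sketch assumes that the video has already been segmented into frames, that each frame is a flattened, non-empty list of grayscale pixel values of equal length, and that two frames count as duplicates when their mean absolute pixel difference does not exceed a threshold. None of these choices is stated in the text above. Text extraction is passed in as a hypothetical `ocr` callable.

```python
from typing import Callable, List, Sequence

# Hypothetical frame representation: flattened grayscale pixel values.
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute pixel difference between two equally sized frames."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def deduplicate_frames(frames: List[Frame], threshold: float = 5.0) -> List[Frame]:
    """Keep a frame only if it differs enough from the last kept frame."""
    kept: List[Frame] = []
    for frame in frames:
        if not kept or mean_abs_diff(kept[-1], frame) > threshold:
            kept.append(frame)
    return kept


def video_text(frames: List[Frame], ocr: Callable[[Frame], str],
               threshold: float = 5.0) -> List[str]:
    """Extract text only from the de-duplicated (effective) frames."""
    return [ocr(frame) for frame in deduplicate_frames(frames, threshold)]
```

Comparing each frame with the last kept frame, rather than with every kept frame, keeps the pass linear in the number of frames, at the cost of keeping a scene again if it reappears later.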
In some embodiments, the method further comprises: acquiring user characteristic information, wherein the user characteristic information comprises: characteristic information of a first user publishing the first information or characteristic information of a second user publishing the auxiliary information; determining whether the first information is false news according to the text content and/or the characteristic information of the first information and the effective auxiliary information, wherein the determining comprises the following steps: and determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information and the user characteristic information.
In some embodiments, the user characteristic information includes a user reliability indicator.
In some embodiments, the method further comprises: obtaining emotional feature information, wherein the emotional feature information comprises at least one of the following for the first information and/or the auxiliary information: emotion category, emotion intensity, emotion expression. In this case, determining whether the first information is false news according to the text content and/or the characteristic information of the first information and the effective auxiliary information comprises: determining whether the first information is false news according to the text content and/or the characteristic information of the first information and the effective auxiliary information, and the emotional feature information.
In some embodiments, the method further comprises: acquiring propagation characteristic information, wherein the propagation characteristic information comprises forwarding characteristics of the first information and auxiliary information propagation characteristics; determining whether the first information is false news according to the text content and/or the characteristic information of the first information and the effective auxiliary information, wherein the determining comprises the following steps: and determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information and the propagation characteristic information.
In some embodiments, acquiring the propagation characteristic information, which includes the forwarding characteristic of the first information and the auxiliary information propagation characteristic, specifically includes: acquiring the forwarding count of the first information and the proportion of users forwarding the first information whose user reliability index is higher than a first preset value, and determining the forwarding characteristic of the first information accordingly; acquiring the quantity of auxiliary information of the first information and the quantity of effective auxiliary information of the first information, and determining the effective auxiliary information proportion; acquiring the proportion of users publishing auxiliary information of the first information whose user reliability index is higher than a second preset value; and determining the auxiliary information propagation characteristic according to the quantity of auxiliary information, the effective auxiliary information proportion, and the proportion of users whose user reliability index is higher than the second preset value.
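By way of non-limiting illustration, the quantities that make up the propagation characteristic information can be computed as follows. The names `PropagationFeatures` and `propagation_features` are hypothetical, and the sketch assumes that a user reliability index is already available for each forwarding user and each commenting user; how the individual quantities are then combined into a single characteristic is left open here, as it is in the text above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PropagationFeatures:
    forward_count: int               # forwarding count of the first information
    reliable_forwarder_ratio: float  # forwarders above the first preset value
    auxiliary_count: int             # quantity of auxiliary information
    effective_auxiliary_ratio: float # effective / all auxiliary information
    reliable_commenter_ratio: float  # commenters above the second preset value


def _ratio(numerator: int, denominator: int) -> float:
    """Proportion that is defined as 0.0 when there is nothing to count."""
    return numerator / denominator if denominator else 0.0


def propagation_features(
    forwarder_reliability: List[float],
    commenter_reliability: List[float],
    effective_auxiliary_count: int,
    first_preset: float,
    second_preset: float,
) -> PropagationFeatures:
    """Compute the forwarding and auxiliary-information quantities."""
    forward_count = len(forwarder_reliability)
    auxiliary_count = len(commenter_reliability)
    return PropagationFeatures(
        forward_count=forward_count,
        reliable_forwarder_ratio=_ratio(
            sum(r > first_preset for r in forwarder_reliability), forward_count),
        auxiliary_count=auxiliary_count,
        effective_auxiliary_ratio=_ratio(effective_auxiliary_count, auxiliary_count),
        reliable_commenter_ratio=_ratio(
            sum(r > second_preset for r in commenter_reliability), auxiliary_count),
    )
```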
In a second aspect, an embodiment of the present invention further provides a false news identification apparatus, including: the first acquisition module is used for acquiring first information to be identified and auxiliary information thereof; the second acquisition module is used for acquiring the text content of the first information and the auxiliary information; the screening module is used for screening effective auxiliary information from the auxiliary information; the third acquisition module is used for acquiring the first information and the characteristic information of the effective auxiliary information according to the text content of the first information and the effective auxiliary information; and the determining module is used for determining whether the first information is false news or not according to the first information and the text content and/or the characteristic information of the effective auxiliary information.
In some embodiments, the characteristic information is point of view characteristic information, and the determining module includes: the system comprises a first acquisition unit and a first determination unit, wherein the first acquisition unit is used for acquiring a first relation between the first information and viewpoint characteristic information of effective auxiliary information of the first information; and the first determining unit is used for determining whether the first information is false news according to the first relation.
In some embodiments, the apparatus further comprises a fourth acquisition module and a fifth acquisition module: the fourth acquisition module is used for acquiring at least one piece of second information related to the topic of the first information; the fifth acquisition module is used for acquiring the text content of the second information and acquiring viewpoint characteristic information of the second information according to the text content; the determining module comprises a second acquiring unit and a second determining unit, wherein the second acquiring unit is used for acquiring a second relation between the viewpoint characteristic information of the first information and the viewpoint characteristic information of the second information; and the second determining unit is used for determining whether the first information is false news according to the second relation.
In some embodiments, the apparatus further comprises a screening module, wherein the screening module comprises at least one of: the first screening unit is used for screening effective auxiliary information according to the correlation degree of the auxiliary information and the first information topic; or the second screening unit is used for acquiring the emotional intensity characteristics of the auxiliary information and screening the effective auxiliary information according to the emotional intensity characteristics.
In some embodiments, the first information comprises: text information, picture information, video information or audio information, when the first information is video information, the second acquisition module includes: the segmentation unit is used for segmenting the video to obtain a single-frame image; the duplication removing unit is used for removing duplication of the single-frame image to obtain effective image information; and a third acquisition unit for acquiring the text content of the effective image information.
In some embodiments, the apparatus further comprises: a sixth obtaining module, configured to obtain user characteristic information, where the user characteristic information includes: characteristic information of a first user publishing the first information or characteristic information of a second user publishing the auxiliary information; and the determining module is specifically used for determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information and the user characteristic information.
In some embodiments, the apparatus further comprises: a seventh obtaining module, configured to obtain emotional feature information, where the emotional feature information includes at least one of the following for the first information and/or the auxiliary information: emotion category, emotion intensity, emotion expression; and the determining module is specifically used for determining whether the first information is false news according to the text content and/or the characteristic information of the first information and the effective auxiliary information, and the emotional feature information.
In some embodiments, the apparatus further comprises: an eighth obtaining module, configured to obtain propagation feature information, where the propagation feature information includes a forwarding feature and an auxiliary information feature of the first information; and the determining module is specifically used for determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information and the propagation characteristic information.
In a third aspect, an embodiment of the present invention further provides a false news identification method, including: acquiring news to be identified, acquiring the content of the news to be identified, and acquiring the viewpoint of the news to be identified according to the content of the news to be identified; obtaining comments of news to be identified, obtaining the contents of the comments, selecting effective comments according to the contents of the comments, and obtaining the viewpoints of the effective comments according to the contents of the effective comments; and identifying whether the news to be identified is false news or not according to the relation between the viewpoint of the news to be identified and the viewpoint of the effective comment.
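By way of non-limiting illustration, one simple way to summarise the relation between the viewpoint of the news to be identified and the viewpoints of the effective comments is sketched below. The sketch assumes that each effective comment has already been assigned a stance label relative to the news ("support", "oppose" or "neutral"), and it uses a thresholded proportion of opposing comments as a stand-in decision rule. Both the labels and the threshold rule are assumptions for illustration only; the text above does not specify how the relation is evaluated. The names `viewpoint_relation` and `looks_false` are hypothetical.

```python
from collections import Counter
from typing import Dict, List

STANCES = ("support", "oppose", "neutral")  # hypothetical stance labels


def viewpoint_relation(comment_stances: List[str]) -> Dict[str, float]:
    """Proportion of effective comments holding each stance toward the news."""
    total = len(comment_stances)
    counts = Counter(comment_stances)
    return {s: (counts[s] / total if total else 0.0) for s in STANCES}


def looks_false(comment_stances: List[str], oppose_threshold: float = 0.5) -> bool:
    """Illustrative rule: flag the news when enough effective comments oppose it."""
    return viewpoint_relation(comment_stances)["oppose"] >= oppose_threshold
```

In practice the proportions returned by `viewpoint_relation` would more plausibly be one input among several to a trained classifier than the sole basis for the decision.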
In some embodiments, the method further comprises: determining a topic of news to be identified, acquiring at least one comparison news related to the topic, and acquiring the content and the viewpoint of the comparison news; identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment, and the method comprises the following steps: and identifying whether the news to be identified is false news or not according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment and the relationship between the viewpoint of the news to be identified and the viewpoint of the comparison news.
In some embodiments, the method further comprises: acquiring the publisher of the news to be identified and acquiring publisher user characteristics; and acquiring the publisher of each effective comment and acquiring reviewer user characteristics. In this case, identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment comprises: identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment, the publisher user characteristics, and the reviewer user characteristics.
In some embodiments, the reviewer user characteristics include an indicator of the reviewer's capability to identify false news, the indicator being associated with at least one of the following factors: a ratio of a historical review viewpoint of a user to a true news viewpoint, a ratio of a historical review viewpoint of a user to a false news viewpoint, a ratio of a historical review viewpoint of a user to a true news viewpoint, and a ratio of a historical review viewpoint of a user to a false news viewpoint. Alternatively, the publisher user characteristics include a publisher user reliability indicator, the publisher user reliability indicator being related to at least one of the following factors: the ratio of true news historically published by the user to the total news published by the user; the ratio of true news historically published from the IP address used by the user to publish the news to be identified to the total news published from that IP address; and the ratio of true news historically published in the time period in which the user publishes the news to be identified to the total news published in that time period.
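By way of non-limiting illustration, a publisher user reliability indicator built from the three publication ratios listed above can be sketched as follows. Combining the ratios by a weighted average is an assumption made for the sketch, as are the default equal weights and the names `true_ratio` and `publisher_reliability`; the text above only states that the indicator is related to at least one of these factors.

```python
from typing import Sequence, Tuple


def true_ratio(true_count: int, total_count: int) -> float:
    """Ratio of true news to total news; 0.0 when there is no history."""
    return true_count / total_count if total_count else 0.0


def publisher_reliability(
    user_history: Tuple[int, int],
    ip_history: Tuple[int, int],
    period_history: Tuple[int, int],
    weights: Sequence[float] = (1 / 3, 1 / 3, 1 / 3),
) -> float:
    """Weighted average of the user, IP-address and time-period ratios.

    Each history is a (true_count, total_count) pair; the weighting
    scheme is an illustrative assumption.
    """
    ratios = (
        true_ratio(*user_history),
        true_ratio(*ip_history),
        true_ratio(*period_history),
    )
    weight_sum = sum(weights)
    if weight_sum == 0:
        return 0.0
    return sum(w * r for w, r in zip(weights, ratios)) / weight_sum
```

For example, histories of 8 true out of 10 for the user, 1 out of 2 for the IP address and 3 out of 4 for the time period, with weights (0.5, 0.25, 0.25), combine as 0.5 × 0.8 + 0.25 × 0.5 + 0.25 × 0.75 = 0.7125.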
In some embodiments, the method further comprises: obtaining the emotional characteristics of news and comments to be identified, wherein the emotional characteristics of the comments are related to at least one of the following factors: emotion classification, emotion intensity, emotion expression; identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment, and the method comprises the following steps: and identifying whether the news to be identified is false news or not according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment and the emotional characteristics of the news to be identified and the comment.
In some embodiments, the method further comprises: acquiring the propagation characteristics of the news to be identified, wherein the propagation characteristics are related to at least one of the following factors: the forwarding count of the news to be identified; the proportion of users forwarding the news to be identified whose user reliability index is higher than a first preset value; the proportion of effective comments among all comments; and the proportion of reviewers commenting on the news to be identified whose user reliability index is higher than a second preset value. In this case, identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment comprises: identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment, and the propagation characteristics of the news to be identified.
In some embodiments, selecting valid reviews from the content of the reviews comprises: and selecting effective comments according to the topic relevance of the comments and the news to be identified and/or the emotional intensity of the comments.
In some embodiments, the presentation of the news is in the form of: text news, picture news, video news, or audio news.
In a fourth aspect, an embodiment of the present invention further provides a false news identification apparatus, including: the ninth acquisition module is used for acquiring news to be identified, acquiring the content of the news to be identified and acquiring the viewpoint of the news to be identified according to the content of the news to be identified; the tenth acquisition module is used for acquiring comments of news to be identified, acquiring the contents of the comments, selecting effective comments according to the contents of the comments, and acquiring the viewpoints of the effective comments according to the contents of the effective comments; and the identification module is used for identifying whether the news to be identified is false news or not according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment.
In some embodiments, the apparatus further comprises: the eleventh acquisition module is used for determining a topic of news to be identified, acquiring at least one comparison news related to the topic, and acquiring the content and the viewpoint of the comparison news; the identification module is specifically configured to: and identifying whether the news to be identified is false news or not according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment and the relationship between the viewpoint of the news to be identified and the viewpoint of the comparison news.
In some embodiments, the device further includes a twelfth obtaining module, configured to obtain a publisher of the news to be identified, obtain publisher user characteristics, obtain a publisher of the effective comment, and obtain reviewer user characteristics; the identification module is specifically configured to: and identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment, and the characteristics of the publisher user and the characteristics of the reviewer user.
In some embodiments, the apparatus further comprises a thirteenth acquisition module, wherein the thirteenth acquisition module is used for acquiring the emotional characteristics of the news to be identified and of the comments, and the emotional characteristics of the comments are related to at least one of the following factors: emotion category, emotion intensity, emotion expression; the identification module is specifically configured to: identify whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment, and the emotional characteristics of the news to be identified and of the comments.
In some embodiments, the apparatus further comprises a fourteenth acquisition module, wherein the fourteenth acquisition module is used for acquiring the propagation characteristics of the news to be identified, and the propagation characteristics are related to at least one of the following factors: the forwarding count of the news to be identified; the proportion of users forwarding the news to be identified whose user reliability index is higher than a first preset value; the proportion of effective comments among all comments; and the proportion of reviewers commenting on the news to be identified whose user reliability index is higher than a second preset value; the identification module is specifically configured to: identify whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment, and the propagation characteristics of the news to be identified.
In a fifth aspect, an embodiment of the present invention further provides a false news identification device, including: at least one processor; a memory coupled with the at least one processor, the memory storing executable instructions, wherein the executable instructions, when executed by the at least one processor, cause the method of any of the first or third aspects above to be implemented.
In a sixth aspect, an embodiment of the present invention further provides a chip, configured to perform the method in the first aspect. Specifically, the chip includes: a processor for calling and running the computer program from the memory so that the device on which the chip is installed is used for executing the method of the first aspect or the third aspect.
In a seventh aspect, the present invention also provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method according to any one of the first aspect or the third aspect.
In an eighth aspect, the present invention also provides a computer program product, which includes computer program instructions, and the computer program instructions make a computer execute the method in the first aspect or the third aspect.
Therefore, the false news identification method and the false news identification device in the embodiments of the invention extract the content of the first information, which serves as the news body, and of the auxiliary information, and determine whether the first information is false news according to the effective auxiliary information together with the text content and/or the characteristic information of the first information, which increases the diversity of data and features and improves judgment accuracy. In addition, because the embodiments of the invention screen the auxiliary information to extract the effective auxiliary information, judgment accuracy is further improved, and data processing efficiency is improved along with the accuracy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is a flow chart of an embodiment of a false news identification method of the present application;
FIG. 2 is a schematic diagram of a framework of an embodiment of a false news identification method according to the present application;
FIG. 3 is a schematic structural diagram of an embodiment of a false news identification device according to the present application.
Detailed description of the preferred embodiments
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
It is noted that, in the embodiments of the present application, relational terms such as "first" and "second", and the like, are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
FIG. 1 is a flowchart of an embodiment of a false news identification method. As shown in FIG. 1, in a first aspect, an embodiment of the present application provides a false news identification method, including:
Step 110, acquiring first information to be identified and auxiliary information thereof.
In this step, the first information to be identified may be news. In implementation, the news to be identified can be obtained from mainstream media, from public social media, or from external public opinion networks and other media. The mainstream media and social media may include Chinese-language and foreign-language media, which is not limited in this application.
With the development of science and technology, media forms have become increasingly diverse, and the rise of the live-streaming and short-video industries has diversified the ways in which news is released and spread. Meanwhile, the public has more and more ways to take part in discussing news, for example forwarding the news, commenting under the news, sending bullet comments (barrage) in video news, or posting voice comments. In the present application, information that is not the news body itself but is associated with it is referred to as auxiliary information. Comments and bullet comments are used as examples of auxiliary information in this application, but as technology develops, new kinds of news-related information may appear, and the auxiliary information of the application also covers auxiliary information generated in such new forms under future technology.
In this step, the first information to be identified and its auxiliary information may be acquired by various existing and future technologies, such as web crawler technology, which is not limited in this application.
Step 120, obtaining the text content of the first information and the auxiliary information.
In this step, the text content of the first information and the auxiliary information acquired in step 110 is obtained. Whatever form the news (the first information) and its auxiliary information take, both are converted into text data in this step.
It is understood that news may be text-based, image-based, video-based, audio-based, or other forms based on future technologies, and therefore, the first information may be text information, picture information, video information, audio information, and the like, and accordingly, the auxiliary information may also be text information, picture information, video information, or audio information, which is not limited in this application.
In this step, text content is extracted from the first information or the auxiliary information, whatever its form. Existing technology can be used for text extraction. For example, text can be extracted by optical character recognition (OCR), which is generally divided into two steps: text positioning and text recognition.
For example, the CTPN model can be used for text positioning. It extracts the spatial features and the sequence features of the scene text at the same time: the spatial features help to improve the detection accuracy for small text segments, and the sequence features capture the relations between the small text segments.
After positioning, because texts differ in length, a Convolutional Recurrent Neural Network (CRNN) model can be used, which converts text recognition into a time-dependent sequence learning problem.
For example, when the first information or the auxiliary information is speech information, a Connectionist Temporal Classification (CTC) model may be employed as the speech recognition model, for instance an end-to-end RNN speech recognition method based on CTC.
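As an illustration of how a CTC-based recognizer turns per-frame outputs into text, the following minimal sketch shows greedy CTC decoding: take the best class per frame, merge consecutive repeats, then drop the blank symbol. The function name, the blank index of 0, and the toy score matrix are assumptions made for the example; the application does not specify a decoding strategy.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet):
    """Collapse per-frame class scores into a label string.

    frame_scores: one list of class scores per frame (class 0 is the blank).
    alphabet: alphabet[i - 1] is the character for class i.
    """
    # Pick the best-scoring class for every frame.
    best_path = [max(range(len(s)), key=s.__getitem__) for s in frame_scores]
    # Merge consecutive repeats, then drop blanks.
    collapsed = [label for label, _ in groupby(best_path)]
    return "".join(alphabet[label - 1] for label in collapsed if label != BLANK)


# Toy example: frames vote a, a, blank, a, b.
frames = [
    [0.1, 0.8, 0.1],
    [0.1, 0.8, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.7, 0.2],
    [0.1, 0.2, 0.7],
]
decoded = ctc_greedy_decode(frames, "ab")
```

The blank between the second and fourth frames is what allows the repeated character to survive the merge step.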
When the first information or the auxiliary information thereof is video information, the corresponding text content thereof can be obtained by the following steps: segmenting the video to obtain a single-frame image; removing the duplication of the single-frame image to obtain effective image information; and acquiring the text content of the effective image information.
Step 130, screening out effective auxiliary information from the auxiliary information.
In this step, effective auxiliary information is screened out from the auxiliary information. It should be understood that, in the present application, related information other than the news body (i.e., other than the first information itself) is referred to as auxiliary information; for example, comments under the news, bullet comments (barrage) in videos, and text, pictures, or videos appearing in the comment area are all auxiliary information of the first information.
However, research finds that much auxiliary information cannot serve as a basis for false news identification. For example, comments such as "first" or "grabbing the sofa" are unrelated to the news topic and merely seek attention online. In the present application, the auxiliary information that can be used as a basis for identifying false news is referred to as effective auxiliary information; examples include auxiliary information related to the topic and auxiliary information having a certain emotional intensity. In this step, effective auxiliary information is screened out from the auxiliary information according to the content of the auxiliary information or the characteristics of the text content obtained from it.
Step 140, obtaining the feature information of the first information and the effective auxiliary information according to the text content of the first information and the effective auxiliary information.
In this step, feature information of the first information and the effective auxiliary information is extracted so that whether the first information is false news can be determined through the feature information. It is understood that the feature information includes various features of the first information and the effective auxiliary information, such as viewpoint features, emotional features, user features, and propagation features, and may also include other features that allow the first information to be compared with other news on the same topic or that characterize the effective auxiliary information. In general, the feature information characterizes the first information and the effective auxiliary information, and it may be obtained from their respective text content.
Step 150, determining whether the first information is false news according to the text content and/or the characteristic information of the first information and the effective auxiliary information.
Specifically, whether the first information is false news can be determined according to the text content of the first information and its effective auxiliary information, according to the characteristic information of the first information and its effective auxiliary information, or according to both the text content and the characteristic information.
According to the embodiment of the application, the content of both the first information, which serves as the news body, and its auxiliary information is extracted, effective auxiliary information is screened out of the auxiliary information, and whether the first information is false news is determined according to the text content and/or the characteristic information of the first information and the effective auxiliary information. Judging the first information with the additional content of its auxiliary information increases the diversity of data and features and improves judgment accuracy. In addition, screening the auxiliary information to obtain the effective auxiliary information further improves accuracy and, at the same time, improves data processing efficiency.
Further, in the false news identification method, the characteristic information is viewpoint characteristic information. Specifically, the viewpoint feature may include a viewpoint feature of the news body which is the first information, or may include a viewpoint feature of the effective auxiliary information of the first information. When the effective auxiliary information is an effective comment, the effective auxiliary information viewpoint feature is an effective comment viewpoint feature. The embodiment of the application provides clues for judging whether news is true or false by mining the relation between the news viewpoint and the effective comment viewpoint.
At this time, determining whether the first information is false news according to the text content and/or the feature information of the first information and the effective auxiliary information specifically includes: acquiring a first relation between viewpoint characteristic information of first information and effective auxiliary information thereof; and determining whether the first information is false news according to the first relation.
Further, the false news identification method of the application further comprises the following steps: acquiring at least one piece of second information related to the first information topic; acquiring text content of the second information, and acquiring viewpoint characteristic information of the second information according to the text content; at this time, determining whether the first information is false news according to the text content and/or the feature information of the first information and the effective auxiliary information includes: acquiring a second relation between viewpoint characteristic information of the first information and viewpoint characteristic information of the second information; and determining whether the first information is false news according to the second relation.
It is understood that there may be one or more pieces of second information. This step introduces viewpoint features between news items on the same topic, and whether the first information is false news is identified according to the second relation between those viewpoint features.
The relationship between a news viewpoint and the viewpoints of its effective comments, and the relationship between a news viewpoint and the viewpoints of other news on the same topic, are both relationships between viewpoints. Relationships between viewpoints can be divided into three categories: support, neutral, and contradiction, where contradiction can be further divided into inconsistency, antonymy, and negation. Equivalently, the relationships can be described as supporting, neutral, and opposing viewpoints.
For example, the relationship between viewpoints can be expressed with a viewpoint relationship label L, where L = 1 for a supporting viewpoint, L = 0 for a neutral viewpoint, and L = -1 for an opposing viewpoint. For a specific news item to be identified (i.e., the first information), denoted N_i, the set of news items on the same topic is {N_1, N_2, ..., N_m} and the comment set of the news to be identified is {C_i1, C_i2, ..., C_in}. P_Cik denotes the probability of the viewpoint relationship between the news and its comment C_ik, and P_Nj denotes the probability of the viewpoint relationship between the news and the same-topic news item N_j. The weighted average of the probabilities P_Cik and the corresponding L values gives the comment viewpoint consistency score of the news to be identified, denoted S_C; the weighted average of the probabilities P_Nj and the corresponding L values gives the inter-news viewpoint consistency score of the news to be identified, denoted S_N.
In addition, the proportion of comments on the news to be identified that support its viewpoint can be obtained, denoted R_Sdi, as can the proportion of same-topic news items that support its viewpoint, denoted R_Sfi. Correspondingly, the proportion of comments that oppose its viewpoint, denoted R_Odi, and the proportion of same-topic news items that oppose its viewpoint, denoted R_Ofi, can also be obtained.
Finally, the viewpoint feature of the news to be identified is constructed from the comment viewpoint consistency score S_C, the inter-news viewpoint consistency score S_N, the proportion of supporting comments R_Sdi, the proportion of opposing comments R_Odi, the proportion of supporting news R_Sfi, and the proportion of opposing news R_Ofi, and is used as one factor in identifying whether the news to be identified is false news.
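The six viewpoint quantities can be illustrated with a short sketch. It assumes that each comment or same-topic news item has already been assigned a relation label L and a probability P by some upstream model, and it reads "weighted average" as the probability-weighted mean of the label values; the application does not fix the exact weighting, so this normalization, like the names `Relation`, `consistency_score`, and `viewpoint_features`, is an assumption.

```python
from dataclasses import dataclass

SUPPORT, NEUTRAL, OPPOSE = 1, 0, -1  # values of the relation label L


@dataclass
class Relation:
    label: int          # L: 1 (support), 0 (neutral) or -1 (oppose)
    probability: float  # P: probability assigned to that relation


def consistency_score(relations):
    """Probability-weighted mean of the label values (assumed weighting)."""
    total = sum(r.probability for r in relations)
    if total == 0:
        return 0.0
    return sum(r.probability * r.label for r in relations) / total


def ratio(relations, label):
    """Share of items whose relation label equals `label`."""
    if not relations:
        return 0.0
    return sum(1 for r in relations if r.label == label) / len(relations)


def viewpoint_features(comment_relations, news_relations):
    return {
        "S_C": consistency_score(comment_relations),
        "S_N": consistency_score(news_relations),
        "R_Sdi": ratio(comment_relations, SUPPORT),
        "R_Odi": ratio(comment_relations, OPPOSE),
        "R_Sfi": ratio(news_relations, SUPPORT),
        "R_Ofi": ratio(news_relations, OPPOSE),
    }


comments = [Relation(SUPPORT, 0.5), Relation(OPPOSE, 0.5), Relation(SUPPORT, 1.0)]
same_topic = [Relation(OPPOSE, 0.8), Relation(NEUTRAL, 0.2)]
features = viewpoint_features(comments, same_topic)
```

A strongly negative S_N together with a high R_Ofi would indicate that other news on the same topic tends to contradict the news being examined.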
Specifically, the viewpoint features of news or comments can be extracted by various existing and future technologies. For example, triples are extracted from the news and the comments with a triple extraction template based on dependency syntax, the similarity between triples is computed by a triple alignment module, and contradiction/support viewpoint features are then extracted from the suspected contradictory triples.
For example, the existing pyltp tool can be used to represent the semantic information of each piece of text as several triples, and the triples of different sentences are aligned to find suspected contradictory triples. A BiLSTM contradiction prediction model then predicts the relationship between the viewpoints of two news items on the same topic, giving a viewpoint consistency score between the news and other news on the same topic, from which it is judged whether the first information, as news, is false news. The details of the specific tools used are not described in this application.
It is understood that the first information in the present application refers to a certain news to be identified, and in practical operation, the news to be identified may be defined as the first information, and the corresponding other news on the same topic is the second information.
Further, the method of the present application screens out valid auxiliary information from the auxiliary information, which includes at least one of the following: screening effective auxiliary information according to the relevance of the auxiliary information and the first information topic; or acquiring the emotional intensity characteristics of the auxiliary information, and screening out effective auxiliary information according to the emotional intensity characteristics.
For example, effective auxiliary information can be screened out according to the degree of topic relevance between the auxiliary information and the first information. Specifically, auxiliary information that is more relevant to the topic is more likely to be effective auxiliary information, and auxiliary information that is less relevant is less likely to be. Various existing and future models, such as the ESIM model, may be used, and the present application is not limited thereto.
Effective auxiliary information can also be screened out according to the emotional intensity characteristics of the auxiliary information. Specifically, the more pronounced the emotional intensity, the more likely the auxiliary information is to be effective auxiliary information.
In the embodiments of the application, the topic relevance between the auxiliary information and the first information and the emotional intensity characteristics of the auxiliary information can also be combined to form the effective auxiliary information set; combining the two can provide more accurate decision support for identifying false news.
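The screening step can be sketched as a simple threshold filter. The sketch assumes that a topic relevance score and an emotional intensity score, both in [0, 1], have already been computed for each comment by upstream models; the thresholds, the `Comment` class, and the `require_both` switch are hypothetical and only illustrate the "either criterion" and "combined" variants described above.

```python
from dataclasses import dataclass


@dataclass
class Comment:
    text: str
    topic_relevance: float    # assumed in [0, 1], e.g. from a text-matching model
    emotion_intensity: float  # assumed in [0, 1]


def screen_effective(comments, min_relevance=0.5, min_intensity=0.5,
                     require_both=False):
    """Keep comments that pass either criterion, or both if require_both."""
    combine = all if require_both else any
    return [
        c for c in comments
        if combine((c.topic_relevance >= min_relevance,
                    c.emotion_intensity >= min_intensity))
    ]


comments = [
    Comment("first!", 0.05, 0.10),
    Comment("The agency denied this yesterday.", 0.90, 0.20),
    Comment("This is outrageous!!!", 0.30, 0.95),
    Comment("Outrageous, the agency already denied this!", 0.80, 0.90),
]
effective_any = [c.text for c in screen_effective(comments)]
effective_both = [c.text for c in screen_effective(comments, require_both=True)]
```

With the default settings the attention-seeking comment "first!" is dropped while topical or emotionally strong comments are kept.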
Further in the method of the present application, the first information includes: text information, picture information, video information, or audio information; acquiring the text content of the first information to be recognized includes extracting the text content from the text information, the picture information, the video information or the audio information of the first information.
The embodiments of the application take into account, comprehensively and from multiple angles, that any news carrier may contain text information, and use different models to extract the text from different data types.
Further, when the first information is video information, acquiring text content of the first information to be identified includes: segmenting the video to obtain a single-frame image; carrying out duplicate removal on the single-frame image to obtain effective image information; and acquiring the text content of the effective image information.
It is understood that a video is a sequence of still images, and each still image that makes up the video is called a "frame". Therefore, for video news media, the original video file is segmented using a preset time window, which may be a fixed time window, to obtain a single-frame image per unit time window.
In some cases, for example, in the situations of live broadcasting, screen recording and the like, the possibility that the pictures in the video are unchanged for a long time exists, so that the embodiment of the application performs deduplication work on all acquired single-frame images, retains the images of all different pictures, and ensures the non-redundancy and effectiveness of information.
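The segmentation and deduplication steps can be sketched as follows. To stay self-contained, the sketch treats each decoded frame as a `bytes` object and removes only byte-identical frames by hashing; this is an assumption made for illustration, and a real system would more likely decode frames with a video library and compare them with a perceptual similarity measure. The function names and the one-second window are likewise hypothetical.

```python
import hashlib


def sample_frames(frames, fps, window_seconds=1.0):
    """Keep one frame per fixed time window."""
    step = max(1, round(fps * window_seconds))
    return frames[::step]


def deduplicate(frames):
    """Drop frames already seen, preserving first-occurrence order."""
    seen = set()
    unique = []
    for frame in frames:
        digest = hashlib.sha256(frame).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(frame)
    return unique


# Toy "video": 2 s of picture A, 2 s of picture B, 1 s of picture A at 25 fps.
video = [b"A"] * 50 + [b"B"] * 50 + [b"A"] * 25
sampled = sample_frames(video, fps=25, window_seconds=1.0)
effective_images = deduplicate(sampled)
```

Only the distinct pictures remain, so the later text extraction step runs once per picture rather than once per frame.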
The picture data acquired based on the video and the news data using the picture as a transmission medium form a picture data set.
When extracting text content, this embodiment may use an OCR pipeline built on the deep-learning CTPN model: a VGG-16 network scans the picture, a convolution of, for example, 3 × 3 is applied to the feature map of the last convolutional layer (conv5), the resulting feature vectors are fed into a Bi-LSTM, and a fully connected (FC) layer finally classifies and regresses the features to obtain all effective text positions. A deep-learning CRNN model then extracts a convolutional feature map from each effective text region, feeds it into a Bi-LSTM, and passes the final Bi-LSTM output to a transcription layer that recognizes text of variable length, yielding the final character output sequence.
The CTPN + CRNN OCR model ensures that all text information in the non-duplicated pictures is fully retained; it is not limited by the character length of the language and recognizes variable-length text well. Other models may also be used, which is not limited in this application.
Further, the false news identification method of the application further comprises the following steps: acquiring user characteristic information, wherein the user characteristic information comprises: characteristic information of a first user publishing the first information or characteristic information of a second user publishing the auxiliary information; at this time, determining whether the first information is false news according to the text content and/or the feature information of the first information and the effective auxiliary information specifically includes: and determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information and the user characteristic information.
It is understood that the user characteristics may include user base characteristics and user statistical characteristics. The user basic features may include, for example, public user basic information, and the user statistical features may include, for example, reliability evaluation indexes. The user statistical characteristics may be based on statistical characteristics of publishers and statistical characteristics of reviewers according to whether the user is a publisher or a reviewer. The user characteristics can reflect the habits and the reliability of the user to a certain extent. It should be noted that, in the present application, the user characteristics may include characteristic information of a first user who publishes the first information, and characteristic information of a second user who publishes the auxiliary information (or characteristic information of the second user who publishes the valid auxiliary information).
Specifically, the user basic features may include publicly disclosed basic information such as the user ID, the time at which the user published the first information or the auxiliary information, and the location (IP address). The user ID is the unique identifier of the user. Taking a user who posts comments as an example, the time and location features represent when and where the user posted the comment. Empirically, news published at an abnormal time or place is more likely to be false news, and comments published at an abnormal time may come from paid posters (a "water army") seeking to inflame public opinion. Specifically, the user basic features may include the following factors: the ID of the user who published the first information (i.e., the news to be identified), the news publication time, the news publication place, the ID of the auxiliary information publisher, and the auxiliary information publication time.
User statistics may utilize statistical models to reflect the trustworthiness of past behavior of publishers or reviewers. For example, when the user is the first user who posts the first information (i.e. news to be identified), the publisher-based statistical characteristics include a ratio of true news published by the news publisher to total news published, a ratio of true news to total news in the same time period, and a ratio of true news to total news in the same place. When the user is a second user publishing the auxiliary information, the statistical characteristics of the second user comprise a valid comment ratio, a ratio of the comment to be consistent with a true news viewpoint, a ratio of the comment to be opposite to the true news viewpoint, a ratio of the comment to be consistent with a false news viewpoint, a ratio of the comment to be opposite to the false news viewpoint and the like.
When the user is a second user (i.e., a user who publishes auxiliary information), the embodiment of the present application determines the reliability index of the second user from the statistical characteristics of the second user. The index may specifically depend on the following variables: the ratio A1 of comments consistent with the viewpoint of true news, the ratio A2 of comments opposed to the viewpoint of false news, the ratio B1 of comments opposed to the viewpoint of true news, and the ratio B2 of comments consistent with the viewpoint of false news. Larger values of A1 and A2 indicate that the second user is better at identifying false news, and smaller values of B1 and B2 indicate the same. Specifically, the second user reliability index may be formed by taking the logarithm of (A1 + A2)/(B1 + B2) and then combining the resulting value with the effective comment ratio of the second user.
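A minimal sketch of such a reliability index is given below. The logarithm of (A1 + A2)/(B1 + B2) follows the description; how that value is combined with the effective comment ratio is not specified in the application, so the multiplication used here is an assumption, as are the function name and the small constant `eps` that guards against division by zero.

```python
import math


def reliability_index(a1, a2, b1, b2, effective_ratio, eps=1e-6):
    """Reliability of a commenter.

    a1, a2: ratios that indicate good judgment (larger is better).
    b1, b2: ratios that indicate poor judgment (smaller is better).
    effective_ratio: share of the user's comments that are effective.
    """
    log_odds = math.log((a1 + a2 + eps) / (b1 + b2 + eps))
    # Assumed combination: scale the log ratio by the effective comment ratio.
    return log_odds * effective_ratio


careful_user = reliability_index(0.6, 0.2, 0.1, 0.1, effective_ratio=0.5)
careless_user = reliability_index(0.1, 0.1, 0.6, 0.2, effective_ratio=0.5)
```

A positive value indicates that the user's comments have mostly been on the right side of past true and false news; a negative value indicates the opposite.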
According to the embodiment of the application, the reliability indexes of the first user and the second user are determined through data, and whether the first information to be identified is false news or not is determined by combining the user reliability indexes of a news publisher and an auxiliary information publisher, so that the accuracy and the efficiency of identification are further improved.
Further, the false news identification method of the application further comprises the following steps: obtaining emotional feature information, wherein the emotional feature information comprises at least one of the following first information and/or auxiliary information: emotion classification, emotion intensity, emotion expression; at this time, determining whether the first information is false news according to the text content and/or the feature information of the first information and the effective auxiliary information includes: and determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information and the emotional characteristic information.
The following is specifically described in three aspects of emotion classification, emotion intensity and emotion expression:
Emotion classification: among news with positive or neutral sentiment, true news far outnumbers false news. False news, by contrast, often carries negative emotion in order to win the approval and sympathy of others, and its comments often carry questioning emotion.
The embodiments of the present application analyze the emotion categories of news and comments (for convenience of explanation, auxiliary information is referred to below as "comments"). Emotion category features are captured with a multi-class emotion analysis model (BiLSTM), which classifies emotion as positive, negative, or neutral.
Emotional intensity: false news aims to pass falsehood off as truth and uses rhetorical techniques to make readers more willing to trust its viewpoint. Based on analysis of an actual data set, the inventors found that the emotional intensity of false news is stronger than that of real news, so as to provoke strong emotion in the reader.
Therefore, the method analyzes the emotional intensity of news, specifically, the method can calculate the emotional intensity characteristic based on the Softmax function, the emotional intensity characteristic value can be a continuous value from 0 to 1, and the larger the numerical value is, the stronger the emotional intensity is.
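One way to obtain an intensity value in the range 0 to 1 from a Softmax output is sketched below. It assumes the emotion classifier emits three raw scores in the order positive, negative, neutral, and it takes the larger of the two non-neutral probabilities as the intensity; the application only states that the intensity is computed based on the Softmax function, so this particular definition and the names used are assumptions.

```python
import math

LABELS = ("positive", "negative", "neutral")  # assumed output order


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def emotion_features(logits):
    """Return (emotion category, emotional intensity in [0, 1])."""
    probs = dict(zip(LABELS, softmax(logits)))
    category = max(probs, key=probs.get)
    # Assumed definition: strength of the dominant non-neutral emotion.
    intensity = max(probs["positive"], probs["negative"])
    return category, intensity


category, intensity = emotion_features([0.0, 5.0, 0.0])
```

Under this definition a text that the classifier considers clearly neutral receives an intensity near 0, and a strongly positive or negative text receives an intensity near 1.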
Emotion expression: false news generally uses exaggerated, vivid, and controversial vocabulary to make its content more popular, whereas true news mostly uses plain words to convey its content. Mining emotion expression features therefore helps to identify false news.
In this method, the emotion expression words in the news are extracted, and, based on a statistical model, the frequency C with which each emotion expression word appears in real news and the frequency D with which it appears in false news are calculated. If a word appears frequently in both real and false news, or rarely in both, its emotion expression feature value is close to 0. If it appears frequently in real news and rarely in false news, the value is close to 1. Conversely, if it appears rarely in real news and frequently in false news, the value is close to -1.
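The application does not give a formula for the emotion expression feature value. One simple function with the stated behavior is the difference C - D, where C and D are taken as the fractions of real and false news items containing the word, so that both lie in [0, 1]. The sketch below uses that assumed definition; the function names and the toy corpora are hypothetical.

```python
def document_frequency(word, documents):
    """Fraction of documents (token collections) that contain the word."""
    if not documents:
        return 0.0
    return sum(1 for doc in documents if word in doc) / len(documents)


def expression_feature(word, real_news, fake_news):
    """Assumed feature: C - D, which lies in [-1, 1]."""
    c = document_frequency(word, real_news)  # frequency C in real news
    d = document_frequency(word, fake_news)  # frequency D in false news
    return c - d


real_news = [{"report", "said"}, {"said"}]
fake_news = [{"shocking", "said"}, {"shocking", "said"}]

common_word = expression_feature("said", real_news, fake_news)
fake_marker = expression_feature("shocking", real_news, fake_news)
real_marker = expression_feature("report", real_news, fake_news)
```

A word that appears everywhere carries no signal, while a word found only in false news is pushed toward -1.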
The emotion characteristics are comprehensively obtained through extracting three characteristics, namely emotion category characteristics, emotion intensity characteristics and emotion expression characteristics.
According to the embodiment, whether the news is false news or not is comprehensively judged by combining the first information and/or the emotional characteristic information of the auxiliary information, and the judgment accuracy is further improved.
Further, the false news identification method of the application further comprises the following steps: acquiring propagation characteristic information, wherein the propagation characteristic information comprises forwarding characteristics of the first information and auxiliary information propagation characteristics; at this time, determining whether the first information is false news according to the text content and/or the feature information of the first information and the effective auxiliary information includes: and determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information and the propagation characteristic information.
Specifically, the propagation characteristic information includes a forwarding characteristic of the first information and an auxiliary information propagation characteristic. For example, the forwarding characteristic of the first information is determined from the forwarding amount of the first information and the proportion of users forwarding the first information whose user reliability index is higher than a first preset value. Determining the auxiliary information propagation characteristic may include: acquiring the amount of auxiliary information of the first information and the amount of effective auxiliary information, and determining the effective auxiliary information proportion; acquiring the proportion of users publishing auxiliary information for the first information whose user reliability index is higher than a second preset value; and finally determining the auxiliary information propagation characteristic from the amount of auxiliary information, the effective auxiliary information proportion, and that user proportion.
Another specific example may be: acquiring the proportion of reliable users among the users who propagate the first information, where propagation includes forwarding and publishing auxiliary information. Specifically, this includes obtaining the user reliability index. In this case, the forwarding characteristic of the first information includes the proportion of users forwarding the first information whose user reliability index is higher than a first preset value, and the auxiliary information propagation characteristic includes the proportion of users publishing auxiliary information for the first information whose user reliability index is higher than a second preset value.
Since propagation characteristics such as the forwarding amount, comment amount, and search amount reflect the influence of news, the present application characterizes the propagation of the news, i.e., the first information, by its forwarding amount and comment amount and by the forwarding amount and comment amount of reliable users (for convenience of description and understanding, the auxiliary information is referred to here as comments).
In this embodiment, the propagation characteristics may be divided into forwarding characteristics and comment characteristics, where the forwarding characteristics include forwarding amount and forwarding amount of high-quality users, and the comment characteristics (auxiliary information propagation characteristics) include the number of comments of news, the number and proportion of valid comments, the number and proportion of comments of high-quality users, and the like. Here, the high quality user may be the above-mentioned user with high reliability, i.e., a user with a high reliability index.
According to the method and the device, on the basis of the forwarding number and the comment number of the to-be-identified first information, namely the to-be-identified news, the propagation characteristics of high-quality users are further mined based on the user reliability indexes, the propagation characteristics of the news are constructed in an all-dimensional mode, and the accuracy of false news identification is improved.
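As a non-limiting illustration, the propagation characteristics described above can be sketched as follows. The names `Interaction`, `share_above`, and `propagation_features` are hypothetical, the sketch assumes each forward or comment comes from a distinct user, and the preset values are supplied by the caller; none of this is specified by the application.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    """One forward or one comment (hypothetical structure).

    `reliability` is the reliability index of the acting user;
    `effective` is only meaningful for comments.
    """
    reliability: float
    effective: bool = True


def share_above(values, threshold):
    """Fraction of values strictly above the threshold (0.0 if empty)."""
    if not values:
        return 0.0
    return sum(v > threshold for v in values) / len(values)


def propagation_features(forwards, comments, first_preset, second_preset):
    """Assemble forwarding and comment (auxiliary information) features."""
    effective = [c for c in comments if c.effective]
    return {
        "forward_count": len(forwards),
        "reliable_forwarder_ratio": share_above(
            [f.reliability for f in forwards], first_preset),
        "comment_count": len(comments),
        "effective_comment_ratio":
            len(effective) / len(comments) if comments else 0.0,
        "reliable_commenter_ratio": share_above(
            [c.reliability for c in comments], second_preset),
    }
```

The resulting dictionary is one possible input to the determining step; how the values are weighted or combined is left to the model used there.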
The above embodiments of the present application can be combined with each other to form various specific embodiments, which are not listed one by one for brevity. The application provides a multi-layer false news identification framework based on multivariate data feature mining. Fig. 2 is a schematic frame diagram of an embodiment of the false news identification method. As shown in fig. 2, first, the bottom layer collects diverse news data from different channels and converts it into text form; at the same time, the topic relevance and emotional intensity of news comments are analyzed to extract effective comments from the news comments. Second, the middle layer is responsible for mining viewpoint features, emotion features, user features, and propagation features from the text. Specifically, the viewpoint features include features of mutual support or opposition of viewpoints among news items and between news and comments; the emotion features cover three aspects, namely emotion category, emotion intensity, and emotion expression; the user features include public basic user information and reliability evaluation indexes; the propagation features include the forwarding amount, the comment amount, the forwarding amount and comment amount of reliable users, and the like. Finally, the top layer obtains the false news identification result based on the BERT model.
Fig. 3 is a schematic structural diagram of an embodiment of the false news recognition apparatus of the present application. As shown in fig. 3, an embodiment of the false news recognition apparatus 300 of the present application includes a first obtaining module 301, a second obtaining module 302, a screening module 303, a third obtaining module 304, and a determining module 305, wherein the first obtaining module 301 is configured to obtain first information to be identified and auxiliary information thereof; the second obtaining module 302 is configured to obtain text contents of the first information and the auxiliary information obtained by the first obtaining module 301; the screening module 303 is configured to screen out effective auxiliary information from the auxiliary information whose text content is obtained by the second obtaining module 302; the third obtaining module 304 is configured to obtain feature information of the first information and the effective auxiliary information according to the text content of the first information and the effective auxiliary information obtained by the second obtaining module 302; and the determining module 305 is configured to determine whether the first information is false news according to the text content and/or the feature information of the first information and the effective auxiliary information obtained by the third obtaining module 304.
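As a non-limiting illustration, the cooperation of modules 301 to 305 can be sketched as a small pipeline. The class name `FalseNewsRecognizer` and its four callables are hypothetical placeholders; the application does not prescribe this interface, and the actual text extraction, screening, feature extraction, and classification are supplied by the caller.

```python
from typing import Any, Callable, List, Sequence


class FalseNewsRecognizer:
    """Hypothetical wiring of the modules of apparatus 300."""

    def __init__(
        self,
        extract_text: Callable[[Any], str],            # role of module 302
        is_effective: Callable[[str, str], bool],      # role of module 303
        extract_features: Callable[[str], Any],        # role of module 304
        classify: Callable[[str, List[str], Any, List[Any]], bool],  # 305
    ) -> None:
        self.extract_text = extract_text
        self.is_effective = is_effective
        self.extract_features = extract_features
        self.classify = classify

    def recognize(self, first_info: Any, auxiliary: Sequence[Any]) -> bool:
        """Return True if the first information is judged to be false news."""
        # The caller passing in first_info and auxiliary plays the role
        # of the first obtaining module 301.
        news_text = self.extract_text(first_info)
        aux_texts = [self.extract_text(a) for a in auxiliary]
        # Keep only effective auxiliary information.
        effective = [t for t in aux_texts if self.is_effective(news_text, t)]
        # Features of the news and of each effective auxiliary item.
        news_feat = self.extract_features(news_text)
        aux_feats = [self.extract_features(t) for t in effective]
        return self.classify(news_text, effective, news_feat, aux_feats)
```

Passing the stages in as callables keeps the sketch independent of any particular model, so each stage can be replaced without touching the overall flow.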
The operations executed by the modules of the false news identification device in this embodiment can be specifically referred to the method corresponding to fig. 1, so as to achieve the same technical effect.
According to the embodiment of the application, the content of both the first information, which serves as the news body, and its auxiliary information is extracted, effective auxiliary information is screened out of the auxiliary information, and whether the first information is false news is determined according to the text content and/or the characteristic information of the first information and the effective auxiliary information. Adding the content of the auxiliary information to the judgment increases the diversity of data and features and improves the judgment accuracy. In addition, screening the auxiliary information to obtain the effective auxiliary information further improves the judgment accuracy and, at the same time, improves the data processing efficiency.
Further, in the false news recognition apparatus of the present application, the feature information acquired by the third acquiring module 304 is viewpoint feature information, and the determining module 305 includes a first acquisition unit and a first determination unit, wherein the first acquisition unit is used for acquiring a first relation between the viewpoint characteristic information of the first information and the viewpoint characteristic information of its effective auxiliary information; and the first determination unit is used for determining whether the first information is false news according to the first relation.
Further, the false news identification device of the application further comprises a fourth acquisition module and a fifth acquisition module: the fourth acquisition module is used for acquiring at least one piece of second information related to the topic of the first information; the fifth acquisition module is used for acquiring the text content of the second information and acquiring viewpoint characteristic information of the second information according to the text content; the determining module comprises a second acquisition unit and a second determination unit, wherein the second acquisition unit is used for acquiring a second relation between the viewpoint characteristic information of the first information and the viewpoint characteristic information of the second information; and the second determination unit is used for determining whether the first information is false news according to the second relation.
Further, the screening module 303 of the false news identification apparatus of the present application includes at least one of: the first screening unit is used for screening effective auxiliary information according to the correlation degree of the auxiliary information and the first information topic; or the second screening unit is used for acquiring the emotional intensity characteristics of the auxiliary information and screening the effective auxiliary information according to the emotional intensity characteristics.
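As a non-limiting illustration, the two screening criteria can be sketched as follows. The token-overlap relevance measure, the lexicon-based intensity measure, the thresholds, and the choice to keep comments with higher intensity are all assumptions made for this sketch; the application does not fix how relevance or emotional intensity is computed.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def topic_relevance(news, comment):
    """Jaccard overlap of token sets (a stand-in for a topic model)."""
    a, b = tokens(news), tokens(comment)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def emotional_intensity(comment, lexicon):
    """Share of words that appear in a hypothetical emotion lexicon."""
    words = re.findall(r"[a-z0-9]+", comment.lower())
    if not words:
        return 0.0
    return sum(w in lexicon for w in words) / len(words)


def screen_effective(news, comments, lexicon,
                     min_relevance=0.1, min_intensity=0.2):
    """Keep comments that satisfy at least one of the two criteria."""
    return [
        c for c in comments
        if topic_relevance(news, c) >= min_relevance
        or emotional_intensity(c, lexicon) >= min_intensity
    ]
```

A deployment that uses only one of the two screening units would simply drop the other condition from `screen_effective`.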
Further, in the false news recognition apparatus of the present application, the first information acquired by the first acquiring module 301 includes: text information, picture information, video information, or audio information, and when the first information is video information, the second obtaining module 302 includes: the segmentation unit is used for segmenting the video to obtain a single-frame image; the duplication removing unit is used for removing duplication of the single-frame image to obtain effective image information; and a third acquisition unit for acquiring the text content of the effective image information.
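As a non-limiting illustration, the de-duplication and text acquisition steps for video can be sketched as follows, assuming the video has already been segmented into single-frame images held as byte strings. The functions `deduplicate_frames` and `video_text` are hypothetical, exact hashing is a simplification (near-duplicate frames would need a similarity measure), and the text recognition step is passed in as a caller-supplied `ocr` function.

```python
import hashlib


def deduplicate_frames(frames):
    """Drop frames whose bytes were already seen, keeping first occurrences."""
    seen = set()
    unique = []
    for frame in frames:
        digest = hashlib.sha256(frame).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(frame)
    return unique


def video_text(frames, ocr):
    """Join the non-empty text recognized in each unique frame."""
    texts = (ocr(frame) for frame in deduplicate_frames(frames))
    return "\n".join(t for t in texts if t)
```

De-duplicating before recognition means the comparatively expensive text recognition runs once per distinct frame rather than once per frame.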
Further, the false news identification device of the present application further includes a sixth obtaining module, configured to obtain user characteristic information, where the user characteristic information includes: characteristic information of a first user publishing the first information or characteristic information of a second user publishing the auxiliary information; at this time, the determining module is specifically configured to determine whether the first information is false news according to the text content and/or the feature information of the first information and the valid auxiliary information, and the user feature information.
Further, the false news identification device of the present application further includes a seventh obtaining module, configured to obtain emotional characteristic information, where the emotional characteristic information includes at least one of the following for the first information and/or the auxiliary information: emotion classification, emotion intensity, and emotion expression; at this time, the determining module is specifically configured to determine whether the first information is false news according to the text content and/or the characteristic information of the first information and the effective auxiliary information, and the emotional characteristic information.
Further, the false news identification device further comprises an eighth obtaining module, configured to obtain propagation characteristic information, where the propagation characteristic information includes a forwarding characteristic and an auxiliary information characteristic of the first information; at this time, the determining module is specifically configured to determine whether the first information is false news according to the text content and/or the feature information of the first information and the valid auxiliary information, and the propagation feature information.
Specifically, the forwarding characteristic of the first information may include factors such as the forwarding amount of the first information and the proportion of users forwarding the first information whose user reliability index is higher than a first preset value; the auxiliary information propagation characteristic of the first information may include the following factors: the auxiliary information quantity of the first information, the effective auxiliary information proportion, the proportion of users publishing auxiliary information of the first information whose user reliability index is higher than a second preset value, and the like.
In a third aspect, an embodiment of the present invention further provides a false news identification method, including: acquiring news to be identified, acquiring the content of the news to be identified, and acquiring the viewpoint of the news to be identified according to the content of the news to be identified; obtaining comments of news to be identified, obtaining the contents of the comments, selecting effective comments according to the contents of the comments, and obtaining the viewpoints of the effective comments according to the contents of the effective comments; and identifying whether the news to be identified is false news or not according to the relation between the viewpoint of the news to be identified and the viewpoint of the effective comment.
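As a non-limiting illustration, one simple way to use the relation between the viewpoint of the news and the viewpoints of the effective comments is sketched below. It assumes each effective comment has already been labeled `"support"`, `"oppose"`, or `"neutral"` relative to the news; the labels, the function names, and the threshold rule are hypothetical and stand in for whatever model actually performs the identification.

```python
from collections import Counter


def opposition_ratio(stances):
    """Share of opposing comments among those that take a side."""
    counts = Counter(stances)
    decided = counts["support"] + counts["oppose"]
    return counts["oppose"] / decided if decided else 0.0


def looks_false(stances, threshold=0.5):
    """Flag the news when opposition among effective comments dominates."""
    return opposition_ratio(stances) > threshold
```

In practice the opposition ratio would more likely serve as one feature among several than as a decision rule on its own.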
In some embodiments, the method further comprises: determining a topic of the news to be identified, acquiring at least one piece of comparison news related to the topic, and acquiring the content and the viewpoint of the comparison news; in this case, identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment comprises: identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment and the relationship between the viewpoint of the news to be identified and the viewpoint of the comparison news.
In some embodiments, the method further comprises: acquiring the publisher of the news to be identified and acquiring publisher user characteristics; acquiring the publisher of the effective comment and acquiring reviewer user characteristics; in this case, identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment comprises: identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment, the publisher user characteristics, and the reviewer user characteristics.
In some embodiments, the reviewer user characteristics include an indicator of the reviewer user's capability to identify false news, the indicator being related to at least one of the following factors: a ratio of the user's historical comment viewpoints to true news viewpoints, a ratio of the user's historical comment viewpoints to false news viewpoints, a ratio of the user's historical comment viewpoints to true news viewpoints, and a ratio of the user's historical comment viewpoints to false news viewpoints; alternatively, the publisher user characteristics include a publisher user reliability indicator, the publisher user reliability indicator being related to at least one of the following factors: the ratio of true news to total news historically released by the user; the ratio of true news to total news historically released from the IP address that the user uses to release the news to be identified; and the ratio of true news to total news historically released in the time period in which the user releases the news to be identified.
In some embodiments, the method further comprises: obtaining the emotional characteristics of the news to be identified and of the comments, wherein the emotional characteristics of the comments are related to at least one of the following factors: emotion classification, emotion intensity, and emotion expression; in this case, identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment comprises: identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment and the emotional characteristics of the news to be identified and of the comments.
In some embodiments, the method further comprises: acquiring the propagation characteristics of the news to be identified, wherein the propagation characteristics are related to at least one of the following factors: the forwarding amount of the news to be identified; the proportion of users forwarding the news to be identified whose user reliability index is higher than a first preset value; the proportion of effective comments among all comments; and the proportion of reviewers commenting on the news to be identified whose user reliability index is higher than a second preset value; in this case, identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment comprises: identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment and the propagation characteristics of the news to be identified.
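As a non-limiting illustration, a publisher user reliability indicator built from the three ratios above can be sketched as follows. The function names and the choice to average whichever ratios are available are assumptions made for this sketch; the application only states that the indicator is related to at least one of these factors, not how they are combined.

```python
def true_ratio(true_count, total_count):
    """Ratio of true news to total news, or None when there is no history."""
    return true_count / total_count if total_count else None


def publisher_reliability(user_hist, ip_hist, period_hist):
    """Average the available true-news ratios.

    Each argument is a (true_count, total_count) pair for, respectively,
    the user's history, the history of the IP address used for the release,
    and the history of the release time period.
    """
    ratios = [
        r for r in (
            true_ratio(*user_hist),
            true_ratio(*ip_hist),
            true_ratio(*period_hist),
        )
        if r is not None
    ]
    return sum(ratios) / len(ratios) if ratios else None
```

Returning `None` when no history exists lets the caller distinguish an unknown publisher from one with a poor record.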
In some embodiments, selecting valid reviews from the content of the reviews comprises: and selecting effective comments according to the topic relevance of the comments and the news to be identified and/or the emotional intensity of the comments.
In some embodiments, the presentation of the news is in the form of: text news, picture news, video news, or audio news.
In a fourth aspect, an embodiment of the present invention further provides a false news identification apparatus, including: the ninth acquisition module is used for acquiring news to be identified, acquiring the content of the news to be identified and acquiring the viewpoint of the news to be identified according to the content of the news to be identified; the tenth acquisition module is used for acquiring comments of news to be identified, acquiring the contents of the comments, selecting effective comments according to the contents of the comments, and acquiring the viewpoints of the effective comments according to the contents of the effective comments; and the identification module is used for identifying whether the news to be identified is false news or not according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment.
In some embodiments, the apparatus further comprises: the eleventh acquisition module is used for determining a topic of news to be identified, acquiring at least one comparison news related to the topic, and acquiring the content and the viewpoint of the comparison news; the identification module is specifically configured to: and identifying whether the news to be identified is false news or not according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment and the relationship between the viewpoint of the news to be identified and the viewpoint of the comparison news.
In some embodiments, the device further includes a twelfth obtaining module, configured to obtain a publisher of the news to be identified, obtain publisher user characteristics, obtain a publisher of the effective comment, and obtain reviewer user characteristics; the identification module is specifically configured to: and identifying whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment, and the characteristics of the publisher user and the characteristics of the reviewer user.
In some embodiments, the apparatus further comprises a thirteenth acquisition module, which is used for acquiring the emotional characteristics of the news to be identified and of the comments, wherein the emotional characteristics of the comments are related to at least one of the following factors: emotion classification, emotion intensity, and emotion expression; the identification module is specifically configured to: identify whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment and the emotional characteristics of the news to be identified and of the comments.
In some embodiments, the apparatus further comprises a fourteenth acquisition module, which is used for acquiring the propagation characteristics of the news to be identified, wherein the propagation characteristics are related to at least one of the following factors: the forwarding amount of the news to be identified; the proportion of users forwarding the news to be identified whose user reliability index is higher than a first preset value; the proportion of effective comments among all comments; and the proportion of reviewers commenting on the news to be identified whose user reliability index is higher than a second preset value; the identification module is specifically configured to: identify whether the news to be identified is false news according to the relationship between the viewpoint of the news to be identified and the viewpoint of the effective comment and the propagation characteristics of the news to be identified.
In a fifth aspect, the present invention also provides a false news recognition apparatus, comprising: at least one processor; a memory coupled with the at least one processor, the memory storing executable instructions, wherein the executable instructions, when executed by the at least one processor, cause the method of the first or third aspect of the invention to be carried out. The processor and the memory may be provided separately or may be integrated together.
For example, the memory may include random access memory, flash memory, read-only memory, programmable read-only memory, non-volatile memory, registers, and the like. The processor may be a central processing unit (CPU), a graphics processing unit (GPU), or the like. The memory may store executable instructions. The processor may execute the executable instructions stored in the memory to implement the various processes described in embodiments of the present application.
It will be appreciated that the memory in this embodiment can be either volatile memory or non-volatile memory, or can include both volatile and non-volatile memory. The non-volatile memory may be a ROM (read-only memory), a PROM (programmable read-only memory), an EPROM (erasable programmable read-only memory), an EEPROM (electrically erasable programmable read-only memory), or a flash memory. The volatile memory may be a RAM (random access memory), which serves as an external cache. By way of illustration and not limitation, many forms of RAM are available, such as SRAM (static RAM), DRAM (dynamic RAM), SDRAM (synchronous DRAM), DDR SDRAM (double data rate SDRAM), ESDRAM (enhanced SDRAM), SLDRAM (SyncLink DRAM), and DR RAM (Direct Rambus RAM). The memories described in the embodiments of the present application are intended to comprise, without being limited to, these and any other suitable types of memory.
In some embodiments, the memory stores elements, upgrade packages, executable units, or data structures, or a subset thereof, or an extended set thereof: an operating system and an application program.
The operating system includes various system programs, such as a framework layer, a core library layer, a driver layer, and the like, and is used for implementing various basic services and processing hardware-based tasks. The application programs comprise various application programs and are used for realizing various application services. The program for implementing the method of the embodiment of the present invention may be included in the application program.
In an embodiment of the present invention, the processor is configured to execute the method steps provided in the first aspect or the third aspect by calling a program or an instruction stored in the memory, specifically, a program or an instruction stored in an application program.
In a sixth aspect, an embodiment of the present invention further provides a chip, configured to perform the method in the first aspect or the third aspect. Specifically, the chip includes: a processor for calling and running the computer program from the memory so that the device on which the chip is installed is used for executing the method of the first aspect or the third aspect.
Furthermore, in a seventh aspect, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of the first or third aspect of the present invention.
For example, the machine-readable storage medium may include, but is not limited to, various known and unknown types of non-volatile memory.
In an eighth aspect, the present invention also provides a computer program product, which includes computer program instructions, and the computer program instructions make a computer execute the method in the first aspect or the third aspect.
Those of skill in the art would understand that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or as combinations of software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments of the present application, the disclosed system, apparatus and method may be implemented in other ways. For example, the division of the unit is only one logic function division, and there may be another division manner in actual implementation. For example, multiple units or components may be combined or may be integrated into another system. In addition, the coupling between the respective units may be direct coupling or indirect coupling. In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or may exist separately and physically.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not mean the execution sequence, and the execution sequence of the processes should be determined by the functions and the inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a machine-readable storage medium. Therefore, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a machine-readable storage medium and may include several instructions to cause an electronic device to perform all or part of the processes of the technical solution described in the embodiments of the present application. The storage medium may include various media that can store program codes, such as ROM, RAM, a removable disk, a hard disk, a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, and the scope of the present application is not limited thereto. Those skilled in the art can make changes or substitutions within the technical scope disclosed in the present application, and such changes or substitutions should be within the protective scope of the present application.

Claims (22)

1. A false news identification method, comprising:
acquiring first information to be identified and auxiliary information thereof;
acquiring text contents of the first information and the auxiliary information;
screening effective auxiliary information from the auxiliary information;
acquiring characteristic information of the first information and the effective auxiliary information according to the text content of the first information and the effective auxiliary information;
and determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information.
2. The method according to claim 1, wherein the feature information is viewpoint feature information.
3. The method of claim 2, wherein determining whether the first information is false news according to the text content and/or feature information of the first information and the valid auxiliary information comprises:
acquiring a first relation between viewpoint characteristic information of the first information and effective auxiliary information thereof;
and determining whether the first information is false news or not according to the first relation.
4. The method of claim 2 or 3, further comprising:
acquiring at least one piece of second information related to the first information topic;
acquiring text content of the second information, and acquiring viewpoint characteristic information of the second information according to the text content;
the determining whether the first information is false news according to the text content and/or the feature information of the first information and the effective auxiliary information includes:
acquiring a second relation between viewpoint characteristic information of the first information and viewpoint characteristic information of the second information;
and determining whether the first information is false news or not according to the second relation.
5. The method according to any one of claims 1-4, wherein the screening out valid auxiliary information from the auxiliary information comprises at least one of:
screening effective auxiliary information according to the relevance of the auxiliary information and the first information topic; or
acquiring the emotional intensity characteristics of the auxiliary information, and screening the effective auxiliary information according to the emotional intensity characteristics.
6. The method according to any of claims 1-4, wherein the first information comprises: text information, picture information, video information, or audio information;
the acquiring of the text content of the first information to be identified includes extracting the text content from the text information, the picture information, the video information or the audio information of the first information.
7. The method according to claim 6, wherein when the first information is video information, the obtaining the text content of the first information to be recognized comprises:
segmenting the video to obtain a single-frame image;
removing the duplication of the single-frame image to obtain effective image information;
and acquiring the text content of the effective image information.
8. The method according to any one of claims 1-7, further comprising:
acquiring user characteristic information, wherein the user characteristic information comprises: characteristic information of a first user publishing the first information, or characteristic information of a second user publishing the auxiliary information;
the determining whether the first information is false news according to the text content and/or the feature information of the first information and the effective auxiliary information includes:
and determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information and the user characteristic information.
9. The method of claim 8, wherein the user characteristic information comprises a user reliability indicator.
10. The method according to any one of claims 1-7, further comprising:
obtaining emotional feature information, wherein the emotional feature information comprises at least one of the following for the first information and/or the auxiliary information: emotion category, emotion intensity, and emotion expression;
the determining whether the first information is false news according to the text content and/or the feature information of the first information and the effective auxiliary information includes:
and determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information and the emotional characteristic information.
11. The method according to any one of claims 1-7, further comprising:
acquiring propagation characteristic information, wherein the propagation characteristic information comprises forwarding characteristics and auxiliary information propagation characteristics of the first information;
the determining whether the first information is false news according to the text content and/or the feature information of the first information and the effective auxiliary information includes:
and determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information and the propagation characteristic information.
12. The method according to claim 11, wherein acquiring the propagation characteristic information, which comprises the forwarding characteristic and the auxiliary information propagation characteristic of the first information, specifically comprises:
acquiring the forwarding amount of the first information and the proportion of users, among the users who forward the first information, whose user reliability index is higher than a first preset value, and determining the forwarding characteristic of the first information accordingly;
acquiring the quantity of auxiliary information of the first information and the quantity of effective auxiliary information of the first information, and determining the effective auxiliary information proportion; acquiring the proportion of users, among the users publishing the auxiliary information of the first information, whose user reliability index is higher than a second preset value; and determining the auxiliary information propagation characteristic according to the auxiliary information quantity, the effective auxiliary information proportion, and the proportion of users whose user reliability index is higher than the second preset value.
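The quantities named in claim 12 reduce to a few counts and ratios, which the following Python sketch computes. The function and field names are hypothetical, reliability indices are assumed to be plain numbers with one entry per user, and "higher than" is read as a strict comparison. How these values are then combined into a single characteristic is not specified in the claim and is left out.

```python
from dataclasses import dataclass
from typing import Sequence


def share_above(scores: Sequence[float], threshold: float) -> float:
    """Fraction of scores strictly greater than threshold; 0.0 if there are none."""
    if not scores:
        return 0.0
    return sum(1 for s in scores if s > threshold) / len(scores)


@dataclass
class PropagationFeatures:
    forward_count: int               # forwarding amount of the first information
    reliable_forwarder_share: float  # forwarders above the first preset value
    aux_count: int                   # quantity of auxiliary information
    effective_aux_share: float       # effective / total auxiliary information
    reliable_commenter_share: float  # publishers above the second preset value


def propagation_features(
    forward_count: int,
    forwarder_reliability: Sequence[float],
    aux_count: int,
    effective_aux_count: int,
    commenter_reliability: Sequence[float],
    first_preset: float,
    second_preset: float,
) -> PropagationFeatures:
    """Collect the forwarding and auxiliary-information propagation quantities."""
    effective_share = effective_aux_count / aux_count if aux_count else 0.0
    return PropagationFeatures(
        forward_count=forward_count,
        reliable_forwarder_share=share_above(forwarder_reliability, first_preset),
        aux_count=aux_count,
        effective_aux_share=effective_share,
        reliable_commenter_share=share_above(commenter_reliability, second_preset),
    )
```

For example, four forwarders with indices 0.9, 0.2, 0.8 and 0.4 and a first preset value of 0.5 give a reliable-forwarder share of 0.5, and 4 effective items out of 10 give an effective auxiliary information proportion of 0.4.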
13. A false news recognition apparatus, comprising:
the first acquisition module is used for acquiring first information to be identified and auxiliary information thereof;
the second acquisition module is used for acquiring the text contents of the first information and the auxiliary information;
the screening module is used for screening effective auxiliary information from the auxiliary information;
the third acquisition module is used for acquiring the characteristic information of the first information and the effective auxiliary information according to the text content of the first information and the effective auxiliary information;
and the determining module is used for determining whether the first information is false news or not according to the text content and/or the characteristic information of the first information and the effective auxiliary information.
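The data flow between the five modules of claim 13 can be sketched as a chain of callables. This is a structural illustration only: the function name, the parameter names and the toy stand-ins in the usage part are all hypothetical, and the toy decision rule is not the determination method of the patent.

```python
from typing import Callable, List, Sequence, Tuple


def identify_false_news(
    item_id: str,
    acquire: Callable[[str], Tuple[object, Sequence[object]]],  # first acquisition module
    extract_text: Callable[[object], str],                      # second acquisition module
    screen: Callable[[str, List[str]], List[str]],              # screening module
    extract_features: Callable[[str, List[str]], dict],         # third acquisition module
    decide: Callable[[str, List[str], dict], bool],             # determining module
) -> bool:
    """Pass the output of each module to the next, in the order the claim lists them."""
    first, auxiliary = acquire(item_id)
    first_text = extract_text(first)
    aux_texts = [extract_text(a) for a in auxiliary]
    effective = screen(first_text, aux_texts)
    features = extract_features(first_text, effective)
    return decide(first_text, effective, features)


# Toy stand-ins, purely to show the call shape.
store = {"n1": ("Exchange closes forever", ["this is fake", "ok", "fake, see official notice"])}
result = identify_false_news(
    "n1",
    acquire=lambda i: store[i],
    extract_text=lambda x: x,
    screen=lambda first, aux: [a for a in aux if len(a.split()) >= 3],
    extract_features=lambda first, eff: {"disputes": sum("fake" in a for a in eff)},
    decide=lambda first, eff, feats: bool(eff) and feats["disputes"] > len(eff) / 2,
)
```

Passing each module in as a callable keeps the sketch neutral about how any single step is implemented, which matches the functional wording of the apparatus claim.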
14. The apparatus of claim 13, wherein the feature information is viewpoint feature information, and wherein the determining module comprises a first obtaining unit and a first determining unit, wherein
the first obtaining unit is configured to obtain a first relationship between the viewpoint feature information of the first information and the viewpoint feature information of its effective auxiliary information;
the first determining unit is configured to determine whether the first information is false news according to the first relationship.
15. The apparatus of claim 13 or 14, further comprising a fourth obtaining module and a fifth obtaining module, wherein:
the fourth obtaining module is used for obtaining at least one piece of second information related to the topic of the first information;
the fifth obtaining module is configured to obtain text content of the second information, and obtain viewpoint feature information of the second information according to the text content;
the determining module comprises a second obtaining unit and a second determining unit, wherein,
the second acquiring unit is configured to acquire a second relationship between the viewpoint feature information of the first information and the viewpoint feature information of the second information;
the second determining unit is used for determining whether the first information is false news according to the second relationship.
16. The apparatus of any one of claims 13-15, wherein the screening module comprises at least one of:
a first screening unit, used for screening effective auxiliary information according to the relevance of the auxiliary information to the topic of the first information; or
a second screening unit, used for acquiring emotional intensity characteristics of the auxiliary information and screening effective auxiliary information according to the emotional intensity characteristics.
17. The apparatus according to any of claims 13-16, wherein the first information comprises: text information, picture information, video information or audio information, and when the first information is video information, the second obtaining module includes:
the segmentation unit is used for segmenting the video to obtain a single-frame image;
the duplication removing unit is used for removing duplication of the single-frame image to obtain effective image information;
and the third acquisition unit is used for acquiring the text content of the effective image information.
18. The apparatus of any one of claims 13-17, further comprising:
a sixth obtaining module, configured to obtain user characteristic information, where the user characteristic information includes: characteristic information of a first user publishing the first information, or characteristic information of a second user publishing the auxiliary information;
the determining module is specifically configured to determine whether the first information is false news according to the text content and/or feature information of the first information and the effective auxiliary information, and the user characteristic information.
19. The apparatus of any one of claims 13-17, further comprising:
a seventh obtaining module, configured to obtain emotional feature information, where the emotional feature information includes at least one of the following for the first information and/or the auxiliary information: emotion category, emotion intensity, and emotion expression;
the determining module is specifically configured to determine whether the first information is false news according to the text content and/or feature information of the first information and the effective auxiliary information, and the emotional feature information.
20. The apparatus of any one of claims 13-17, further comprising:
an eighth obtaining module, configured to obtain propagation characteristic information, where the propagation characteristic information includes a forwarding characteristic and an auxiliary information propagation characteristic of the first information;
the determining module is specifically configured to determine whether the first information is false news according to the text content and/or feature information of the first information and the effective auxiliary information, and the propagation feature information.
21. A false news identification device comprising:
at least one processor;
a memory coupled with the at least one processor, the memory storing executable instructions, wherein the executable instructions, when executed by the at least one processor, cause the method of any of claims 1-12 to be implemented.
22. A chip, comprising: a processor configured to call and run a computer program from a memory, so that a device in which the chip is installed performs the method of any one of claims 1-12.
CN202110940711.7A 2021-08-17 2021-08-17 False news identification method, device, equipment and chip Active CN113704400B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110940711.7A CN113704400B (en) 2021-08-17 2021-08-17 False news identification method, device, equipment and chip

Publications (2)

Publication Number Publication Date
CN113704400A (en) 2021-11-26
CN113704400B (en) 2023-04-07

Family ID=78652924

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110940711.7A Active CN113704400B (en) 2021-08-17 2021-08-17 False news identification method, device, equipment and chip

Country Status (1)

Country Link
CN (1) CN113704400B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104202339A (en) * 2014-09-24 2014-12-10 广西大学 User behavior based cross-cloud authentication service method
CN106101092A (en) * 2016-06-07 2016-11-09 腾讯科技(深圳)有限公司 A kind of information evaluation processing method and first instance
CN106484679A (en) * 2016-10-20 2017-03-08 北京邮电大学 A kind of false review information recognition method and device applied on consumption platform
CN110083827A (en) * 2019-03-28 2019-08-02 无锡天脉聚源传媒科技有限公司 Deceptive information discrimination method, system and storage medium based on machine learning
US20210089579A1 (en) * 2019-09-23 2021-03-25 Arizona Board Of Regents On Behalf Of Arizona State University Method and apparatus for collecting, detecting and visualizing fake news

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
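楼靓: "社交网络虚假新闻识别方法" (Method for identifying false news on social networks) *
苏畅: "网络虚假新闻检测系统的研究与实现" (Research and implementation of an online false news detection system), 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 (China Excellent Doctoral and Master's Theses Full-text Database (Master's), Information Science and Technology) *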

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117294526A (en) * 2023-11-22 2023-12-26 深圳大智软件技术有限公司 Communication information sharing method and system
CN117294526B (en) * 2023-11-22 2024-03-12 深圳大智软件技术有限公司 Communication information sharing method and system

Also Published As

Publication number Publication date
CN113704400B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
Xue et al. Detecting fake news by exploring the consistency of multimodal data
CN109684481A (en) The analysis of public opinion method, apparatus, computer equipment and storage medium
CN111400607B (en) Search content output method and device, computer equipment and readable storage medium
CN112541476B (en) Malicious webpage identification method based on semantic feature extraction
CN105279277A (en) Knowledge data processing method and device
CN110287292B (en) Judgment criminal measuring deviation degree prediction method and device
CN109087205A (en) Prediction technique and device, the computer equipment and readable storage medium storing program for executing of public opinion index
CN112258254B (en) Internet advertisement risk monitoring method and system based on big data architecture
CN111783712A (en) Video processing method, device, equipment and medium
WO2021098651A1 (en) Method and apparatus for acquiring risk entity
CN112257452A (en) Emotion recognition model training method, device, equipment and storage medium
CN110287314A (en) Long text credibility evaluation method and system based on Unsupervised clustering
CN111078979A (en) Method and system for identifying network credit website based on OCR and text processing technology
CN110990563A (en) Artificial intelligence-based traditional culture material library construction method and system
CN112069312A (en) Text classification method based on entity recognition and electronic device
CN114155529A (en) Illegal advertisement identification method combining character visual features and character content features
CN113282754A (en) Public opinion detection method, device, equipment and storage medium for news events
CN114495128A (en) Subtitle information detection method, device, equipment and storage medium
CN113704400B (en) False news identification method, device, equipment and chip
CN115100664A (en) Multi-mode false news identification method and system based on correlation information expansion
De Zarate et al. Vocabulary-Based Method for Quantifying Controversy in Social Media.
Agarwal et al. Can twitter help to predict outcome of 2019 indian general election: A deep learning based study
CN108595466B (en) Internet information filtering and internet user information and network card structure analysis method
Mahapatra et al. Automatic hierarchical table of contents generation for educational videos
CN112183093A (en) Enterprise public opinion analysis method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant