CN117235343A - Short video data processing system and processing method based on image processing technology monitoring - Google Patents

Short video data processing system and processing method based on image processing technology monitoring

Info

Publication number
CN117235343A
Authority
CN
China
Prior art keywords
public opinion
data
module
information
short video
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310494088.6A
Other languages
Chinese (zh)
Inventor
苏华权
黄忠靖
裴求根
彭泽武
刘晔
龙震岳
梁哲恒
江疆
周婷
梁盈威
谢瀚阳
冯歆尧
朱泰鹏
林嘉鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Power Grid Co Ltd
Original Assignee
Guangdong Power Grid Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Power Grid Co Ltd
Priority to CN202310494088.6A
Publication of CN117235343A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a short video data processing system and processing method based on image processing technology monitoring. The system comprises a data source layer, a data analysis platform layer, an intelligent analysis platform layer, an application platform layer and a service presentation layer, wherein the data source layer comprises a brand platform library and a third-party search engine. Different keyword schemes are set for different public opinion events, and the system uses crawler technology to collect and monitor short video information from platforms such as Douyin, Kuaishou, Weibo, Xiaohongshu and Bilibili. To screen public opinion information, public opinion keywords are set for monitoring; when the system matches the corresponding keywords, the information is collected and displayed. The invention establishes a basic database and a management database that integrate the collection, classification, monitoring, analysis, research and judgment of short video public opinion information, together with a comprehensive, efficient and visual network public opinion supervision system matched with these databases.

Description

Short video data processing system and processing method based on image processing technology monitoring
Technical Field
The invention relates to video processing technology, in particular to a short video processing system and processing method for brand monitoring based on image processing technology, applied to public opinion big data processing.
Background
At present, mainstream short video platforms include Douyin, Weibo, Kuaishou, Bilibili, Toutiao and Xiaohongshu. Any of these platforms may generate public opinion events, so the monitoring scope is very large and the monitoring workload is heavy. Relevant public opinion must be discovered, tracked, reported and summarized in real time, but staff time and energy are limited, so manual monitoring is impractical. The current approach is to preset monitoring keywords for the monitored field, check whether the same keywords appear in trending topics on the network, determine that public opinion has arisen when they do, and then locate its source. This reduces the workload to some extent, but because each public opinion event has its own characteristics and not all topic keywords can be predicted in advance, the accuracy of the information remains very limited. How to accurately collect and monitor public opinion information on the network has therefore become an important issue.
Disclosure of Invention
To address the problems of existing network public opinion monitoring, namely a very large monitoring scope, a heavy workload, monitoring that is not timely and low monitoring accuracy, the invention provides a short video data processing system and processing method based on image processing technology monitoring that is simple and effective, comprehensive, accurate and visual.
The specific technical scheme adopted by the invention to solve the technical problems is as follows: a short video data processing system based on image processing technology monitoring, characterized in that: the system comprises a data source layer, a data analysis platform layer, an intelligent analysis platform layer, an application platform layer and a service presentation layer; the data source layer comprises a brand platform library and a third-party search engine, and the brand platform library includes but is not limited to Douyin, Kuaishou, Xiaohongshu, Weibo, Toutiao and Bilibili; the data analysis platform layer comprises a preprocessing layer and an information acquisition layer, the information acquisition layer is used for collecting information from the data source layer, and the preprocessing layer is used for preprocessing the information collected by the information acquisition layer and establishing a short video public opinion information management database; the intelligent analysis platform layer comprises a statistical analysis unit and a search engine unit, and the statistical analysis unit is used for constructing a basic platform for analysis, guidance and control service linkage command; the application platform layer applies the short video public opinion analysis data obtained by the data source layer, the data analysis platform layer and the intelligent analysis platform layer to network public opinion management, thereby forming a network public opinion monitoring system; and the service presentation layer is used for presenting the data source layer, the data analysis platform layer, the intelligent analysis platform layer and the application platform layer, and for providing the corresponding visual presentation for network public opinion supervision, application and command.
A basic platform for linkage command of Internet short video public opinion information acquisition, analysis, guidance and control services is built, together with a basic database and a management database that integrate the collection, classification, monitoring, analysis, research and judgment, and presentation of short video public opinion source information, and a comprehensive, efficient and visual network public opinion monitoring system matched with these databases. The monitoring system is simple and effective, comprehensive, timely and accurate, and supports intuitive monitoring.
Preferably, the information acquisition layer comprises an acquisition source configuration module, a priority configuration module, an acquisition agent module, an acquisition group control module, an acquisition monitoring module and a data cleaning module; the acquisition source configuration module is used by the network information mining engine to configure the acquisition sources of short video public opinion information; the priority configuration module is used for configuring the priority storage levels of the collected public opinion information; the acquisition agent module is used for transmitting the collected short video public opinion data to the server at scheduled times and responding in real time to monitoring requests sent by the console; the acquisition group control module is used for uniformly controlling and managing a plurality of short video public opinion data targets; the acquisition monitoring module is used for monitoring and controlling the field equipment that collects the short video public opinion data; and the data cleaning module is used for rechecking and verifying the collected public opinion data, deleting duplicate public opinion data, correcting erroneous data and ensuring the consistency of the public opinion data. This makes the collection of public opinion information from each short video platform more comprehensive, timely, accurate and effective.
Preferably, the preprocessing layer comprises a public opinion early warning module, a data summarizing module, an automatic abstract module, a Chinese word segmentation module, an index construction module, a similar content merging module, an incremental synchronization module, an automatic classification module, a keyword extraction module and a hot word extraction module; the public opinion early warning module is used for issuing automatic early warnings according to preset early warning keywords; the data summarizing module is used for classifying and summarizing public opinion data records according to set standards; the automatic abstract module is used for automatically generating brief, compressed summaries of short video public opinion; the Chinese word segmentation module is used for performing Chinese word segmentation and automatically identifying the sentence meaning of short video public opinion; the index construction module is used for creating all configuration related to index definitions; the similar content merging module is used for identifying and merging similar short video public opinion data; the incremental synchronization module is used for synchronizing daily incremental change data to the data storage module; the automatic classification module is used for indexing and sorting according to user-defined classification standards; the keyword extraction module is used for extracting candidate words and determining the output keywords; and the hot word extraction module is used for extracting high-frequency hot words. This makes the processing of public opinion information from each short video platform more comprehensive, timely, accurate and effective.
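The hot word extraction step can be illustrated with a minimal sketch. It assumes the text has already been split into tokens (for example by the Chinese word segmentation module); the function name `extract_hot_words`, the `stop_words` parameter and the sample tokens are hypothetical and do not come from the patent.

```python
from collections import Counter


def extract_hot_words(token_lists, top_n=3, stop_words=frozenset()):
    """Return the top_n most frequent tokens across all documents.

    token_lists: iterable of token lists, one list per public opinion item.
    stop_words:  tokens to ignore (hypothetical parameter for illustration).
    """
    counts = Counter()
    for tokens in token_lists:
        counts.update(t for t in tokens if t not in stop_words)
    return counts.most_common(top_n)


# Hypothetical, already-segmented sample data.
docs = [
    ["power", "outage", "complaint"],
    ["power", "outage"],
    ["power", "bill"],
]
hot = extract_hot_words(docs, top_n=2)
```

A plain frequency count is only one possible reading of "high-frequency hot words"; a production system might weight terms by recency or by platform.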
Preferably, the statistical analysis unit comprises a media attention module, an automatic clustering module, a phrase relation construction module, a trend analysis module, a statistical report module, a topic research and judgment module, an analysis model construction module, a hot event module, a hot word analysis module, a propagation track module and an industry index release module; the media attention module is used for classifying the attributes of different short video public opinion network platforms and accounts, judging whether they are network public opinion media, and counting and refining repeatedly occurring public opinion words, sentences and titles; the automatic clustering module is used for automatically grouping and classifying public opinion data with identical or similar content; the phrase relation construction module is used for automatically extracting and judging the central meaning of public opinion texts and building an attribute library; the trend analysis module is used for calculating the historical development track of a public opinion event from its occurrence frequency over a user-defined period, and for estimating its future development trend; the statistical report module is used for presenting the processed and counted public opinion data in tabular form; the topic research and judgment module is used for capturing the central idea of a public opinion text and judging its attributes; the analysis model construction module is used for customizing option weights and for analyzing and modeling different types of public opinion data; the hot event module is used for counting and displaying public opinion events that occur with high frequency; the hot word analysis module is used for extracting and displaying words that occur with high frequency across all public opinion events; the propagation track module is used for tracing the historical development track of public opinion events and for calculating, researching and judging their propagation track; and the industry index release module is used for publishing indices that reflect the development of each industry in the market. This improves the diversity, comprehensiveness and effectiveness of the statistical analysis of public opinion information from each short video platform.
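One way to read the trend analysis module is sketched below: events are counted per user-defined period, and the next period is estimated with an ordinary least-squares line. The patent does not state which forecasting method is used, so the linear fit, the function names and the sample dates are all assumptions made for illustration.

```python
from collections import Counter
from datetime import date


def counts_per_period(event_dates, start, period_days):
    """Count events in consecutive buckets of period_days, starting at start."""
    buckets = Counter(
        (d - start).days // period_days for d in event_dates if d >= start
    )
    n = max(buckets) + 1 if buckets else 0
    return [buckets.get(i, 0) for i in range(n)]


def linear_forecast(series):
    """Estimate the next value of series with a least-squares straight line."""
    n = len(series)
    if n < 2:
        return float(series[-1]) if series else 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    slope = sxy / sxx
    return mean_y + slope * (n - mean_x)


# Hypothetical event dates, bucketed by week.
events = [date(2023, 1, 1), date(2023, 1, 2), date(2023, 1, 8),
          date(2023, 1, 15), date(2023, 1, 16), date(2023, 1, 17)]
weekly = counts_per_period(events, start=date(2023, 1, 1), period_days=7)
next_week = linear_forecast(weekly)
```

The per-period counts correspond to the "historical development track"; the forecast is a deliberately simple stand-in for the "future development trend" calculation.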
Preferably, the application platform layer comprises an Internet application unit, a classification detection unit, a public opinion report unit, a hot public opinion unit, a keyword configuration unit, an early warning configuration unit, an address book management unit and a user management unit; the Internet application unit is used for collecting and summarizing the Internet applications on the application platform; the classification detection unit is used for performing independent classification detection on short video public opinion information; the public opinion report unit is used for generating public opinion information reports; the hot public opinion unit is used for presenting high-frequency public opinion within a specific time period; the keyword configuration unit is used for editing and configuring schemes according to the keyword and exclusion-word input syntax; the early warning configuration unit is used for configuring the pushing of early warning information; the address book management unit is used for managing the contacts associated with the short video data processing system; and the user management unit is used for managing the user list and user permissions under an account. This makes the application of short video public opinion information more comprehensive, timely and effective.
Another object of the present invention is to provide a short video data processing method based on image processing technology monitoring, characterized in that it comprises the following data processing methods:
A1. Data architecture processing method: for public opinion videos on mainstream Internet short video platforms, the network public opinion monitoring system is divided into a data acquisition module, a data processing module, a data management and storage module and a data display module; after the Internet public opinion data is collected, it is analyzed and managed to achieve data visualization;
A2. Data acquisition technology processing method: data acquisition is completed cooperatively by a crawler server cluster, which monitors Toutiao, Kuaishou, Weibo, Douyin, Xiaohongshu and Bilibili to obtain massive public opinion data and screens out the useful public opinion information;
A3. Data deduplication technology processing method: data deduplication finds and deletes duplicate public opinion data in a set of digital files and stores only the unique data units; data reconstruction must also be considered when deleting, that is, although part of a file's content has been deleted, the complete file content can still be rebuilt when needed, which requires keeping the index information that maps each file to its unique data units;
A4. OCR recognition technology processing method: the system obtains data from social media platforms through an intelligent crawler system, downloads videos in batches with high concurrency, and extracts frames to obtain pictures, so that the OCR system can quickly and accurately locate and recognize the characters in the pictures; the scenes and objects appearing in a short video are recognized, and the subtitle content in the short video is extracted by OCR, which facilitates full-coverage collection and monitoring of public opinion information;
A5. Video feature recognition technology processing method: data from social media platforms is obtained through the intelligent crawler system, videos are downloaded in batches with high concurrency, and frames are extracted to obtain pictures;
A6. Construction method of the short video network public opinion monitoring system for enterprise image public opinion supervision: public opinion related to the enterprise image is monitored on popular short video platforms including but not limited to Douyin, Weibo, Kuaishou, Toutiao, Bilibili and Xiaohongshu; for public opinion information related to enterprise management, service response, power supply, employees and the company brand image, fully automatic real-time monitoring, analysis and early warning is carried out 7×24 hours a day, which systematically improves the monitored company's ability to control Internet public opinion risk, helps the monitored company discover network public opinion more efficiently, promptly and comprehensively, and gains lead time for handling and guiding public opinion;
A7. Enterprise image public opinion analysis processing method: the professional analysis capability of the network public opinion monitoring system provides accurate public opinion information retrieval and in-depth analysis for reference, helps the monitored company grasp network public opinion trends on short video platforms in time, quickly discover and handle important events with significant influence, and positively guide public opinion and publicity; it provides a decision basis for public relations management, helps the monitored company better maintain its brand image, and creates a favorable public opinion environment for enterprise development.
The method makes the processing and recognition of network public opinion video data from each short video platform, and the building of the enterprise image, more timely, comprehensive and efficient; important events with significant influence are quickly discovered and handled, public opinion and publicity are positively guided, a decision basis is provided for public relations management, the brand image of the monitored company is better maintained, and a favorable public opinion environment is created for enterprise development.
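The deduplication idea in step A3, storing each unique data unit once while keeping a per-file index so that files can be rebuilt, can be sketched as follows. The fixed-size chunking, the use of SHA-256 as the chunk identifier and the class name `DedupStore` are assumptions for illustration; the patent does not specify them.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: unique chunks plus a per-file index."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes (each unique unit stored once)
        self.index = {}   # file name -> ordered list of chunk digests

    def put(self, name, data):
        """Split data into fixed-size chunks and store only unseen chunks."""
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.index[name] = digests

    def get(self, name):
        """Rebuild the complete file content from the index."""
        return b"".join(self.chunks[d] for d in self.index[name])


store = DedupStore(chunk_size=4)
store.put("a", b"AAAABBBBAAAA")
store.put("b", b"BBBBCCCC")
```

Here five chunks are written but only three distinct ones are kept, and `store.get("a")` still returns the original bytes because the index records the chunk order for each file.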
Preferably, the data processing method comprises the following specific processing modes:
B1. A keyword scheme is set to query and monitor public opinion information, and different keywords are configured according to actual service requirements to obtain the related information from the short video platforms; this provides a convenient scheme for discovering public opinion hotspots and monitoring key public opinion;
B2. After the scheme is set, the public opinion data monitored under the scheme can be viewed in an information list;
B3. A quick search is provided, in which one or two simple keywords are set for preliminary retrieval of information;
B4. Short video users with a large number of followers, high activity and/or large influence are screened according to service requirements to generate a key account pool, and further key users are added for monitoring as service experience accumulates; key accounts on different platforms are monitored and collected, the public opinion information published by the short video users in the key account pool is monitored, and AI analysis is then performed;
B5. Visual data: the scheme is analyzed and judged from multiple dimensions on a visual data large screen, and the analyzed data is presented as charts, including sensitive information content, popular information content, regional distribution of publishers, sentiment trend graphs, trends in work publication and interaction, and information source distribution;
B6. For each public opinion scheme, the network public opinion monitoring system analyzes and processes the public opinion data from multiple dimensions and supports generating data charts for a single scheme set by the user; the data charts include public opinion source distribution, public opinion volume per keyword, content word cloud and sensitivity classification;
B7. According to the monitoring scheme of the network public opinion monitoring system, the relevant daily, weekly and monthly data is obtained and compiled into reports, which helps the monitored company understand recent public opinion conditions and improves the efficiency of public opinion handling;
B8. The network public opinion monitoring system collects public opinion data from the whole network, tracks and analyzes the overall propagation of public opinion information on each short video platform, and automatically generates a multi-dimensional comprehensive analysis report covering event trends, website statistics and propagation paths.
This makes the processing of key network public opinion video data from each short video platform more efficient and timely, and makes the tracking analysis of that data more comprehensive and easier to present.
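The keyword scheme of step B1 can be sketched as a small matcher with inclusion keywords and exclusion words. The class `KeywordScheme`, its fields, the case-insensitive substring matching and the sample posts are hypothetical; the patent only states that schemes are configured with keywords according to service requirements.

```python
from dataclasses import dataclass, field


@dataclass
class KeywordScheme:
    """A monitoring scheme: match any keyword unless an exclusion word occurs."""
    name: str
    any_of: list
    exclude: list = field(default_factory=list)

    def matches(self, text):
        lowered = text.lower()
        if any(word.lower() in lowered for word in self.exclude):
            return False
        return any(word.lower() in lowered for word in self.any_of)


def filter_posts(scheme, posts):
    """Return the posts whose text matches the scheme (the 'information list')."""
    return [post for post in posts if scheme.matches(post["text"])]


# Hypothetical scheme and sample posts.
scheme = KeywordScheme(
    name="power supply complaints",
    any_of=["power outage", "blackout"],
    exclude=["movie"],
)
posts = [
    {"id": 1, "text": "Power outage in our district again"},
    {"id": 2, "text": "New movie: Blackout, in cinemas now"},
    {"id": 3, "text": "Nice weather today"},
]
matched = filter_posts(scheme, posts)
```

Substring matching is the simplest choice; for Chinese text a real system would more likely match against segmented tokens.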
Preferably, the OCR technology processing method in step A4 includes the following implementation steps:
C1. Image preprocessing, so that text lines can be located and recognized more reliably and the accuracy of public opinion data recognition is improved;
C2. Locating all text lines in the document image and converting the text in the image into editable text;
C3. Correcting the recognition result according to rules and big data analysis to improve the accuracy of character recognition;
C4. Correcting the recognition result against the original text image to improve the accuracy of character recognition;
C5. Restoring the recognition result to the WEB interface according to the original layout of the text image.
This makes the OCR processing of network public opinion video data from each short video platform more efficient and timely.
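The rule-based correction in step C3 is not detailed in the source. One simple interpretation, shown here purely as an assumption, is to snap each recognized token to the closest entry in a domain lexicon using the standard-library `difflib` matcher; the function name `correct_tokens`, the lexicon and the cutoff value are hypothetical.

```python
import difflib


def correct_tokens(tokens, lexicon, cutoff=0.75):
    """Replace each token with its closest lexicon entry, if one is close enough.

    Tokens already in the lexicon, or with no candidate above cutoff,
    are returned unchanged.
    """
    corrected = []
    for token in tokens:
        if token in lexicon:
            corrected.append(token)
            continue
        candidates = difflib.get_close_matches(token, lexicon, n=1, cutoff=cutoff)
        corrected.append(candidates[0] if candidates else token)
    return corrected


# Hypothetical lexicon and OCR output with digit/letter confusions.
lexicon = ["power", "supply", "outage"]
ocr_tokens = ["p0wer", "supp1y", "xyz"]
fixed = correct_tokens(ocr_tokens, lexicon)
```

The cutoff trades recall for safety: a lower value corrects more errors but risks rewriting tokens that were recognized correctly and are simply missing from the lexicon.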
Preferably, in the data acquisition technology processing method of step A2, the monitored websites are continuously scanned using a massive pool of IP addresses while simulating the access behavior of natural persons; the collected data is stored in a distributed storage service cluster, and all actions and action logs are recorded in a log server cluster. The collected data is passed through a collected-data interface to the data processing sub-modules, such as sentiment judgment and natural language recognition. This makes the IP-address-based monitoring, judgment and collection in public opinion data screening more efficient and timely.
Preferably, in the public opinion data screening of step A2, the public opinion information of each short video platform is collected and monitored; to screen public opinion information, public opinion keywords are set for monitoring, and when the network public opinion monitoring system matches the corresponding keywords, the information is collected and displayed; a key account pool is established so that short video users with a large number of followers and/or large influence receive focused collection and monitoring, and the public opinion information they publish is monitored. This makes focused monitoring and collection in public opinion data screening more timely and effective.
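Building a key account pool can be sketched as a threshold filter over account statistics. The `Account` fields, the threshold values and the use of recent post count as a proxy for activity are assumptions; the patent only names follower count, activity and influence as screening criteria.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    platform: str
    handle: str
    followers: int
    posts_last_30d: int  # hypothetical proxy for activity


def build_key_account_pool(accounts, min_followers=100_000, min_posts=10):
    """Keep accounts that pass either threshold, largest audience first."""
    pool = [
        a for a in accounts
        if a.followers >= min_followers or a.posts_last_30d >= min_posts
    ]
    return sorted(pool, key=lambda a: a.followers, reverse=True)


# Hypothetical accounts.
accounts = [
    Account("Douyin", "big_creator", 2_500_000, 4),
    Account("Kuaishou", "active_local", 8_000, 45),
    Account("Weibo", "quiet_user", 300, 1),
]
pool = build_key_account_pool(accounts)
```

Using "or" between the criteria mirrors the "and/or" wording of the text; a stricter pool would require both thresholds.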
Preferably, the data acquisition module in step A1 collects Internet short video information from Toutiao, Weibo, Kuaishou, Douyin, Bilibili and Xiaohongshu through the crawler server cluster, and passes the information to the data processing module after URL deduplication, collaborative crawling, template matching for known websites and automatic calculation for unknown websites; the data processing performed by the data processing module in step A1 comprises automatic summarization, noise calculation, text classification, text word segmentation, viewpoint extraction, region identification, sensitive-content discovery, hotspot calculation, burst calculation, event extraction and subject word extraction; the management and storage performed by the data management and storage module in step A1 comprises data distribution, data storage, automatic backup, distributed indexing, query management and advanced calculation; and the data presentation performed by the data display module in step A1 comprises user configuration, query requests and data display on the front-end WEB interface. This makes data acquisition more comprehensive, efficient and timely, and easier to present.
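URL deduplication in the acquisition module can be sketched by normalizing each URL and keeping only the first occurrence of each normalized form. The normalization rules chosen here (lowercasing scheme and host, sorting query parameters, dropping fragments and trailing slashes) and the sample URLs are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url):
    """Return a canonical form of url for duplicate detection."""
    parts = urlsplit(url.strip())
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    path = parts.path.rstrip("/") or "/"
    # The fragment is dropped because it does not change the fetched resource.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, query, ""))


def dedupe_urls(urls):
    """Keep the first URL of each normalized form, preserving input order."""
    seen = set()
    unique = []
    for url in urls:
        key = normalize_url(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique


# Hypothetical crawl queue.
queue = [
    "https://Example.com/v/1?b=2&a=1",
    "https://example.com/v/1/?a=1&b=2#comments",
    "https://example.com/v/2",
]
unique = dedupe_urls(queue)
```

An in-memory set is enough for a sketch; a crawler cluster would typically share the seen-set across workers, for example in a key-value store or a Bloom filter.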
The beneficial effects of the invention are as follows: the project builds a basic platform for linkage command of Internet short video public opinion information acquisition, analysis, guidance and control services, and establishes a basic database and a management database that integrate the collection, classification, monitoring, analysis, research and judgment, and presentation of short video public opinion information, together with a comprehensive, efficient and visual network public opinion supervision system matched with these databases. Different keyword schemes are set for different public opinion events, and the system uses crawler technology to collect and monitor short video information from platforms such as Douyin, Kuaishou, Weibo, Xiaohongshu, Bilibili and Toutiao. To screen public opinion information, public opinion keywords are set for monitoring; when the system matches the corresponding keywords, the information is collected and displayed.
Drawings
The invention is described in further detail below with reference to the drawings and the detailed description.
FIG. 1 is a schematic block diagram of a short video data processing system and method based on image processing technology monitoring of the present invention.
Fig. 2 is a flow chart of a method for using the internet public opinion monitoring system in the short video data processing system and the processing method based on the image processing technology monitoring of the present invention.
FIG. 3 is a flow chart of the data architecture processing method in the short video data processing system and the processing method based on image processing technology monitoring.
Fig. 4 is a schematic flow chart of a data acquisition technology processing method in the short video data processing system and the processing method based on image processing technology monitoring.
Fig. 5 is a schematic flow chart of a processing method of the data deduplication technology in the short video data processing system and the processing method based on image processing technology monitoring.
FIG. 6 is a schematic flow chart of the OCR technology processing method in the short video data processing system and the processing method based on the image processing technology monitoring.
Fig. 7 is a schematic flow chart of a video feature recognition technology processing method in the short video data processing system and the processing method based on image processing technology monitoring.
Fig. 8 is a schematic diagram of an interface structure for inquiring and monitoring public opinion information by setting a keyword scheme in the short video data processing system and method based on image processing technology monitoring.
Fig. 9 is a schematic diagram of an interface structure for viewing public opinion data in the short video data processing system and processing method based on image processing technology monitoring according to the present invention.
FIG. 10 is a schematic diagram of an interface structure for quick retrieval in a short video data processing system and method based on image processing technology monitoring according to the present invention.
Fig. 11 is a schematic diagram of an interface structure for monitoring a key account pool in a short video data processing system and a processing method based on image processing technology monitoring.
Fig. 12 is a schematic diagram of another interface structure for monitoring a key account pool in the short video data processing system and the processing method based on image processing technology monitoring.
FIG. 13 is a schematic diagram of an interface structure of visual data presentation in a short video data processing system and method of the present invention based on image processing technology monitoring.
Fig. 14 is a schematic diagram of a multi-dimensional data analysis interface structure for each public opinion scheme in the short video data processing system and processing method based on image processing technology monitoring according to the present invention.
Fig. 15 is a schematic diagram of a multi-dimensional data analysis interface structure for each public opinion scheme in the short video data processing system and processing method based on image processing technology monitoring according to the present invention.
Detailed Description
Example 1:
As shown in FIGS. 1 to 15, a short video data processing system based on image processing technology monitoring comprises a data source layer, a data analysis platform layer, an intelligent analysis platform layer, an application platform layer and a service presentation layer; the data source layer comprises a brand platform library and a third-party search engine, and the brand platform library includes but is not limited to Douyin, Kuaishou, Xiaohongshu, Weibo, Toutiao and Bilibili; the data analysis platform layer comprises a preprocessing layer and an information acquisition layer, the information acquisition layer is used for collecting information from the data source layer, and the preprocessing layer is used for preprocessing the information collected by the information acquisition layer and establishing a short video public opinion information management database; the intelligent analysis platform layer comprises a statistical analysis unit and a search engine unit, and the statistical analysis unit is used for constructing a basic platform for analysis, guidance and control service linkage command; the application platform layer applies the short video public opinion analysis data obtained by the data source layer, the data analysis platform layer and the intelligent analysis platform layer to network public opinion management, thereby forming a network public opinion monitoring system; and the service presentation layer is used for presenting the data source layer, the data analysis platform layer, the intelligent analysis platform layer and the application platform layer, and for providing the corresponding visual presentation for network public opinion supervision, application and command.
Brand monitoring refers to, but is not limited to, short video platform monitoring of well known web names public opinion such as trembling, fast-handedness, redbooks, microblogs, present-day headings and beeps.
The information acquisition sublayer comprises an acquisition source configuration module, a priority configuration module, an acquisition agent module, an acquisition group control module, an acquisition monitoring module and a data cleaning module. The acquisition source configuration module lets the network information mining engine configure the acquisition sources of short video public opinion information; the priority configuration module configures the storage priority of the collected public opinion information; the acquisition agent module transmits the collected short video public opinion data to the server at regular intervals and responds in real time to monitoring requests sent by the console; the acquisition group control module uniformly controls and manages a plurality of short video public opinion data targets; the acquisition monitoring module monitors and controls the field equipment that collects short video public opinion data; and the data cleaning module re-examines and verifies the collected public opinion data, deletes duplicate public opinion data, corrects erroneous data and ensures the consistency of the public opinion data.
Acquisition source configuration module: the information acquisition system is built on a network information mining engine and can collect the latest information from different Internet sites in the shortest possible time, then classify it and unify its format.
Priority configuration module: stores the priority configuration information for the acquisition tables.
Acquisition agent module: collects and stores data at regular intervals, transmits the collected data to the server at regular intervals, responds to monitoring requests sent by the console by transmitting real-time screen snapshots, and controls computer and user operations according to the system settings.
Acquisition group control module: group control is a technique that uses computer and network communication technology to control and manage multiple targets uniformly. It usually connects the controlled targets through a central control node, so that these targets can be remotely controlled and managed.
Acquisition monitoring module: a computer-based production process control and dispatch automation system that monitors and controls field equipment to provide data acquisition, equipment control, parameter measurement and adjustment, and alarms for various signals. It consists of three parts: a lower computer system, an upper computer system, and the communication network that connects them.
Data cleaning module: re-examines and verifies data in order to delete duplicate information, correct existing errors and ensure data consistency, converting dirty data into data that meets the data quality requirements by means of techniques such as mathematical statistics, data mining or predefined cleaning rules.
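As an illustration of the data cleaning module, the following minimal Python sketch normalizes records with two predefined cleaning rules and drops empty or duplicate entries. The record fields (`platform`, `text`), the function names and the rules themselves are assumptions made for this example; the patent does not specify them.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Apply two simple cleaning rules: NFKC normalization and whitespace collapsing."""
    text = unicodedata.normalize("NFKC", text)
    return " ".join(text.split())


def clean_records(records):
    """Return records with normalized text, skipping empty and duplicate entries."""
    seen = set()
    cleaned = []
    for rec in records:
        text = normalize_text(rec.get("text", ""))
        if not text:
            continue  # discard records with no usable content
        key = (rec.get("platform"), text)
        if key in seen:
            continue  # duplicate of a record that was already kept
        seen.add(key)
        cleaned.append({**rec, "text": text})
    return cleaned
```

Keying duplicates on the normalized text rather than the raw text means that records differing only in spacing or character width are treated as the same item.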
The preprocessing sublayer comprises a public opinion early warning module, a data summarizing module, an automatic summary module, a Chinese word segmentation module, an index construction module, a similar content merging module, an incremental synchronization module, an automatic classification module, a keyword extraction module and a hot word extraction module. The public opinion early warning module issues automatic warnings based on preset warning keywords; the data summarizing module classifies and summarizes public opinion data records according to set standards; the automatic summary module automatically generates concise, compressed summaries of short video public opinion information; the Chinese word segmentation module performs Chinese word segmentation so that the meaning of short video public opinion sentences can be recognized automatically; the index construction module creates and defines all configurations related to each index; the similar content merging module processes and identifies similar short video public opinion data; the incremental synchronization module synchronizes daily incremental changes to the data warehouse; the automatic classification module sorts indexes according to user-defined classification standards; the keyword extraction module extracts candidate words and decides which to output as keywords; and the hot word extraction module extracts high-frequency hot words. Specifically:
Public opinion early warning module: judges the sensitivity of each item of information and issues a warning when the sensitivity reaches a threshold. Automatic warning on sensitive words is achieved mainly by automatically judging the region, the semantics and the positive or negative polarity of information across the whole network. Users can also trigger automatic warnings with their own preset warning keywords, for example a keyword set such as "power outage" and "electric shock".
Data summarizing module: classifies the records in the data according to given standards or requirements, then aggregates them by sum, average, count, maximum or minimum, and so on.
Automatic summary module: an information compression technique in which a computer automatically converts a text (or set of texts) into a short summary for a given application; the summary should carry sufficient information, cover the content broadly, and have low redundancy and high readability. Technically, methods are divided mainly into extractive summarization and abstractive (generative) summarization.
Chinese word segmentation module: segments a sequence of Chinese characters into individual words. Chinese word segmentation is the foundation of text mining; successfully segmenting an input passage of Chinese allows the computer to recognize the meaning of its sentences automatically.
Index construction module: creates and defines, for each index, all configurations associated with that index so as to avoid repeated scans; the collected and scanned data is split into shards and divided into static indexes and dynamic indexes.
Similar content merging module: in the similarity stage, fields are the unit of comparison, and the similarity between fields is judged using either default similarity or custom similarity; in the merge stage, data is stored in segments, and smaller segments are periodically merged into larger ones to keep the index size in check and purge deleted entries.
Incremental synchronization module: synchronizes new and changed data from the incoming data to the data warehouse. Tables may be synchronized incrementally each day, which typically requires one full synchronization on the first day.
Automatic classification module: sorts indexes according to user-defined classification standards to improve efficiency. Data items with any difference in content are judged to belong to different indexes, which makes it easy to skip over large amounts of identical data.
Keyword extraction module: candidate words are first extracted and each is labeled as a keyword or not, and these labeled candidates are used to train a keyword extraction classifier. The trained classifier is then used to extract candidate words, predict their labels, and output the final keywords.
Hot word extraction module: compares and ranks the frequency with which the same words appear in different indexes and extracts the high-frequency words (hot words); the number of hot words extracted and displayed can be customized.
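The hot word extraction described above amounts to counting word frequencies across documents and keeping the top entries. A minimal sketch follows; it assumes the texts have already been segmented into words (segmentation itself is out of scope here), and the function name `extract_hot_words` and its parameters are hypothetical.

```python
from collections import Counter


def extract_hot_words(segmented_docs, top_n=10, stop_words=frozenset()):
    """Rank words by how often they occur across all documents and return the top_n."""
    counts = Counter()
    for words in segmented_docs:
        counts.update(w for w in words if w not in stop_words)
    # Sort by descending frequency, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]
```

The `top_n` parameter corresponds to the customizable number of hot words to display.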
The statistical analysis unit comprises a media attention module, an automatic clustering module, a phrase relationship construction module, a trend analysis module, a statistical report module, a topic research and judgment module, an analysis model construction module, a hot event module, a hot word analysis module, a propagation trajectory module and an industry index release module. The media attention module classifies the attributes of different short video public opinion platforms and accounts, judges whether each is a media outlet, and counts and distills repeatedly occurring public opinion words, sentences and titles; the automatic clustering module automatically groups and classifies public opinion data with identical or similar content; the phrase relationship construction module automatically extracts and determines the central meaning of public opinion texts and builds an attribute library; the trend analysis module calculates the historical development trajectory of a public opinion event in units of occurrence frequency and user-defined time period, and estimates its future trend; the statistical report module presents the processed and counted public opinion data in tabular form; the topic research and judgment module captures the central idea of a public opinion text and determines its attributes; the analysis model construction module supports custom option weights for analyzing different types of public opinion data; the hot event module counts and displays frequently occurring public opinion events; the hot word analysis module extracts and displays the high-frequency words across all public opinion events; the propagation trajectory module traces the historical development of a public opinion event and calculates its propagation trajectory; and the industry index release module publishes indexes that reflect the development of the various industries in the market. Specifically:
Media attention module: collects statistics on publishing platforms and publishing accounts, classifies the attributes of the different platforms and accounts, and judges whether each is a media outlet. It compares the texts of platforms and accounts, counts and distills repeatedly occurring words, sentences and titles, and generates a media attention value through an algorithm.
Automatic clustering module: aggregates and classifies data with identical or similar content so that unrelated data can be skipped, improving data retrieval efficiency; it works in cooperation with the other modules.
Phrase relationship construction module: uses natural language processing (NLP) to analyze the relationships between words and sentences within a text in order to extract and determine the text's central meaning and attributes.
Trend analysis module: extracts and counts repeated words, sentences and titles, classifies and counts events in combination with the event attribute classification, calculates the historical development trajectory of each event in units of occurrence frequency and user-defined time period, and estimates its future trend.
Statistical report module: presents the processed and counted data in tabular form.
Topic research and judgment module: uses NLP to capture the central idea of a text and determine its attributes (topics).
Analysis model construction module: supports custom analysis models and custom option weights for performing different types of analysis.
Hot event module: counts and displays frequently occurring events on the basis of keyword extraction and topic research and judgment.
Hot word analysis module: extracts the high-frequency words across all events for display.
Propagation trajectory module: traces back the historical development of an event using text content comparison and NLP, and calculates the event's propagation trajectory in units of the occurrence frequency of identical and similar events and a specified time period.
Industry index release module: an industry index is an index that represents the development status of a particular industry in the market.
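The trend analysis module counts event occurrences per user-defined period and then projects a future trend. The patent does not say how the projection is computed; the sketch below assumes a simple least-squares line fitted to the per-period counts, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def bucket_counts(timestamps, start, period, n_buckets):
    """Count events in n_buckets consecutive periods of length `period` from `start`."""
    counts = [0] * n_buckets
    for ts in timestamps:
        idx = (ts - start) // period  # timedelta // timedelta yields an int
        if 0 <= idx < n_buckets:
            counts[idx] += 1
    return counts


def linear_forecast(counts):
    """Fit y = a + b*x by least squares and extrapolate one period ahead."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two periods to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # A count cannot be negative, so clamp the projection at zero.
    return max(0.0, intercept + slope * n)
```

A linear fit is only one possible choice; a moving average or a seasonal model could be substituted without changing the bucketing step.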
The application platform layer comprises an Internet application unit, a classification detection unit, a public opinion report unit, a hot public opinion unit, a keyword configuration unit, an early warning configuration unit, an address book management unit and a user management unit. The Internet application unit collects and aggregates the Internet applications on the application platform; the classification detection unit performs independent classification detection of short video public opinion information; the public opinion report unit generates public opinion reports; the hot public opinion unit presents high-frequency public opinion within a specified time period; the keyword configuration unit supports editing and configuration according to the input grammar for keywords and exclusion words; the early warning configuration unit configures the pushing of early warning information; the address book management unit manages the contacts associated with the short video data processing system; and the user management unit manages the user list and user permissions under an account. Specifically:
Classification detection unit: performs independent classification for each of the different categories.
Public opinion report unit: generates reports from a preset fixed template and selects the volume of public opinion included in the report.
Hot public opinion unit: presents high-frequency public opinion within a specified time period.
Keyword configuration unit: sets and edits the keywords and exclusion words of a scheme, following the input grammar for keywords and exclusion words.
Early warning configuration unit: sets up information warnings and pushes early warning information according to options such as information attribute, source, media, and IP location.
Address book management unit: manages the contacts associated with the system.
User management unit: manages the user list and user permissions under an account.
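The keyword configuration unit pairs keywords with exclusion words. The input grammar is not given in the text, so the sketch below assumes the simplest plausible semantics: a text matches a scheme if it contains at least one keyword and none of the exclusion words. The function name and the case-insensitive substring matching are assumptions.

```python
def matches_scheme(text, keywords, exclude=()):
    """Return True if text contains any keyword and no exclusion word."""
    lowered = text.lower()
    if any(word.lower() in lowered for word in exclude):
        return False  # an exclusion word vetoes the match
    return any(word.lower() in lowered for word in keywords)
```

For example, a scheme with keywords `["power outage", "electric shock"]` and exclusion word `"movie"` would keep a post reporting an outage but drop a post about an outage scene in a film.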
Example 2:
As shown in Figs. 1 to 15, a short video data processing method based on image processing technology monitoring includes the following data processing methods:
A1. Data architecture processing method: the network public opinion monitoring system divides the processing of public opinion videos from mainstream Internet short video platforms into a data acquisition module, a data processing module, a data management and storage module and a data display module; after Internet public opinion data is collected, it is analyzed and managed so as to realize data visualization;
A2. Data acquisition technology processing method: data acquisition is performed cooperatively by a crawler server cluster, which monitors the massive public opinion data on Toutiao, Kuaishou, Weibo, Douyin, Xiaohongshu and Bilibili and screens out the useful public opinion data;
A3. Data deduplication technology processing method: data deduplication means finding and deleting duplicate public opinion data in a set of digital files and storing only the unique data units. Deletion must take data reconstruction into account: although part of a file's content is deleted, the complete file content must still be reconstructible when needed, which requires retaining index information that maps each file to its unique data units;
A4. OCR recognition technology processing method: the system acquires data from social media platforms through an intelligent crawler system, downloads videos in batches with high concurrency, and extracts frames to obtain pictures, so that the OCR system can quickly and accurately locate and recognize the text in the pictures; OCR recognition is applied to the scenes and objects appearing in the short video, and the subtitle content of the short video is likewise extracted by OCR, which facilitates full-coverage acquisition and monitoring of public opinion information;
A5. Video feature recognition technology processing method: data from social media platforms is acquired through the intelligent crawler system, videos are downloaded in batches with high concurrency, and frames are extracted to obtain pictures;
A6. Construction method of the short video network public opinion monitoring system for enterprise image public opinion supervision: public opinion related to the enterprise image is monitored on popular short video platforms including but not limited to Douyin, Weibo, Kuaishou, Toutiao, Bilibili and Xiaohongshu. For public opinion information concerning enterprise management, service response, power supply, employees and the company's brand image, fully automatic real-time monitoring, analysis and early warning around the clock (7×24 hours) systematically improves the monitored company's ability to control Internet public opinion risk, helps it discover network public opinion more efficiently, promptly and comprehensively, and lets it gain the initiative in handling and guiding public opinion;
A7. Enterprise image public opinion analysis processing method: the professional analysis capability of the network public opinion monitoring system provides accurate public opinion retrieval and in-depth analytical reference, helping the monitored company keep abreast of public opinion trends on short video platforms, quickly discover and handle major events with significant impact, positively guide public opinion and publicity, and obtain a decision basis for public opinion management, so that the company can better maintain its brand image and enjoy a favorable public opinion environment for enterprise development.
The data processing method comprises the following specific processing steps:
B1. Set a keyword scheme to query and monitor public opinion information, configuring different keywords according to actual business requirements to obtain the relevant information from short video platforms; this provides a convenient means of discovering public opinion hotspots and monitoring key public opinion;
B2. Once a scheme has been set, the public opinion data monitored under it can be viewed in an information list;
B3. Set up a quick search, using one or two simple keywords for preliminary retrieval of information;
B4. Screen short video users with a large number of followers, high activity and/or great influence according to business requirements to generate a key account pool, adding further key users for monitoring as business experience accumulates; key accounts on the different platforms are monitored and collected, and the public opinion information published by short video users in the key account pool is monitored for further AI analysis;
B5. Data visualization: schemes are analyzed and assessed from multiple dimensions on a large visual data screen, and the analyzed data is presented as charts covering sensitive information content, popular information content, regional distribution of publishers, sentiment trends, trends in posts and interactions, and distribution of information sources;
B6. For each public opinion scheme, the network public opinion monitoring system analyzes and processes public opinion data from multiple dimensions and supports generating data charts for a single user-defined scheme; the charts cover the distribution of public opinion sources, public opinion volume per keyword, content word clouds and sensitivity classification;
B7. According to the monitoring schemes of the network public opinion monitoring system, data for the relevant days, weeks and months is collected and compiled into reports, helping the monitored company understand recent public opinion and handle it more efficiently;
B8. The network public opinion monitoring system collects public opinion data across the whole network, tracks and analyzes the overall propagation of public opinion information on each short video platform, and automatically generates a multi-dimensional comprehensive analysis report covering event trends, website statistics and propagation paths.
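The key account pool of step B4 can be pictured as a threshold filter over account statistics. The sketch below is illustrative only: the field names (`followers`, `posts_per_week`), the default thresholds and the `require_all` switch, which models the "and/or" wording of the step, are all assumptions.

```python
def build_key_account_pool(accounts, min_followers=100_000,
                           min_posts_per_week=5, require_all=False):
    """Return the ids of accounts that meet the follower and/or activity thresholds."""
    combine = all if require_all else any
    pool = []
    for acc in accounts:
        checks = (
            acc.get("followers", 0) >= min_followers,
            acc.get("posts_per_week", 0) >= min_posts_per_week,
        )
        if combine(checks):
            pool.append(acc["id"])
    return pool
```

With `require_all=False` an account qualifies on either criterion, which gives a wider pool; setting it to `True` keeps only accounts that are both large and active.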
The OCR technology processing method in step A4 comprises the following implementation steps:
C1. Image preprocessing, performed so that text lines can be located and recognized more reliably, thereby improving the accuracy of public opinion data recognition;
C2. Locating all text lines in the document image and converting the text they contain into editable text;
C3. Correcting the recognition result according to rules and big data analysis to improve the accuracy of character recognition;
C4. Correcting the recognition result against the original text image to improve the accuracy of character recognition;
C5. Restoring the recognition result to the web interface according to the original layout of the original text image.
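To make the rule-based correction of step C3 concrete, the following sketch repairs OCR tokens by trying single-character substitutions from a confusion table and accepting a candidate only if it appears in a lexicon. The confusion table, the lexicon and the whitespace tokenization are assumptions chosen for a compact English example; the patent does not describe its rules, and Chinese text would need a different tokenizer.

```python
def correct_token(token, lexicon, confusions):
    """Return a lexicon word reachable by one confusion substitution, else the token."""
    if token in lexicon:
        return token
    for i, ch in enumerate(token):
        for alt in confusions.get(ch, ()):
            candidate = token[:i] + alt + token[i + 1:]
            if candidate in lexicon:
                return candidate
    return token  # leave unknown tokens untouched rather than guess


def correct_line(line, lexicon, confusions):
    """Correct each whitespace-separated token of a recognized text line."""
    return " ".join(correct_token(t, lexicon, confusions) for t in line.split())
```

Tokens that would need more than one substitution are deliberately left unchanged, which keeps the correction conservative.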
In the data acquisition technology processing method of step A2, the monitored websites are scanned continuously using a massive pool of IP addresses and simulated natural-person access behavior; the collected data is stored in a distributed storage service cluster, and all actions and behavior logs are recorded in a log server cluster. The collected data is passed through a collected-data interface to data processing submodules such as sentiment judgment and natural language recognition.
In the screening of public opinion data in step A2, public opinion information from each short video platform is collected and monitored so that it can be screened: public opinion keywords are set for monitoring, and the network public opinion monitoring system collects and displays the information that matches the corresponding keywords. By establishing a key account pool, short video users with many followers and/or great influence are collected and monitored as a priority, and the public opinion information they publish is monitored.
As shown in Fig. 4, the data acquisition module of step A1 collects Internet short video information from Toutiao, Weibo, Kuaishou, Douyin, Bilibili and Xiaohongshu through the crawler server cluster and, after URL deduplication, collaborative crawling, template matching for known websites and automatic computation for unknown websites, passes the information to the data processing module. The processing performed by the data processing module of step A1 includes automatic summarization, noise calculation, text classification, text word segmentation, viewpoint extraction, region identification, sensitive content discovery, hotspot calculation, burst calculation, event extraction and subject word extraction. The management and storage performed by the data management and storage module of step A1 includes data distribution, data storage, automatic backup, distributed indexing, query management and advanced computation. The presentation performed by the data presentation module of step A1 includes user configuration, query requests and data presentation in the front-end web interface.
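Deduplicating crawled URLs before hand-off to the data processing module can be sketched as normalizing each URL to a canonical form and remembering the forms already seen. The normalization rules below (lowercasing scheme and host, dropping fragments and a small set of tracking parameters, sorting query parameters) and the names `normalize_url` and `UrlDeduplicator` are assumptions for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters that do not change the target page.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}


def normalize_url(url):
    """Reduce a URL to a canonical form so equivalent links compare equal."""
    parts = urlsplit(url.strip())
    query = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in TRACKING_PARAMS
    )
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


class UrlDeduplicator:
    """Remember canonical URLs and report whether a URL is new."""

    def __init__(self):
        self._seen = set()

    def is_new(self, url):
        key = normalize_url(url)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

An in-memory set is enough for a sketch; a crawler cluster would more plausibly share this state through an external store or a probabilistic filter.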
In the data processing method shown in Fig. 2, the user opens the system link and enters an account and password to log in to the network public opinion monitoring system. Different keyword schemes are set for different public opinion events, and the system uses crawler technology to collect and monitor short video information from platforms such as Douyin, Kuaishou, Weibo, Xiaohongshu, Bilibili and Toutiao. The system performs entity extraction, information deduplication, feature extraction and recognition, OCR recognition and so on on the collected information, monitors the keyword content of each scheme, extracts the keyword information and matches it, and judges whether the information contains sensitive content; if it does, the information is judged to be negative. The network public opinion monitoring system studies and analyzes the information collected under the different schemes and generates data charts for the user to view; from the analyzed charts the system can generate public opinion reports, including monthly and weekly reports.
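The sensitive-content judgment in this flow can be illustrated with a weighted term list: any sensitive term marks an item as negative, and an alert is raised once the summed weight reaches a threshold. The weights, the threshold and the function names are hypothetical; the patent only states that sensitive content leads to a negative judgment and that a sensitivity threshold triggers early warning.

```python
# Hypothetical sensitive terms and weights, for illustration only.
SENSITIVE_WEIGHTS = {"electric shock": 3, "power outage": 2, "complaint": 1}


def sensitivity_score(text, weights=SENSITIVE_WEIGHTS):
    """Sum the weights of all sensitive terms that occur in the text."""
    lowered = text.lower()
    return sum(w for term, w in weights.items() if term in lowered)


def classify(text, alert_threshold=3):
    """Label an item as negative if any sensitive term occurs; alert at the threshold."""
    score = sensitivity_score(text)
    return {
        "score": score,
        "negative": score > 0,
        "alert": score >= alert_threshold,
    }
```

Separating the negative label from the alert flag lets low-weight items appear in reports without triggering a push notification.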
In the data acquisition technology processing method of step A2 shown in Fig. 4, data acquisition is performed cooperatively by the crawler server cluster, which targets the massive data on Toutiao, Kuaishou, Weibo, Douyin, Xiaohongshu, Bilibili and similar platforms and screens out the useful information. The monitored websites are scanned continuously using a massive pool of IP addresses and simulated natural-person access behavior; the collected data is stored in a distributed storage service cluster, and all actions and behavior logs are recorded in a log server cluster. The collected data is passed through a collected-data interface to data processing submodules such as sentiment judgment and natural language recognition.
In the data deduplication technology processing method of step A3 shown in Fig. 5, data deduplication means finding and deleting duplicate data in a set of digital files and storing only the unique data units. Deletion must take data reconstruction into account: although part of a file's content is deleted, the complete file content must still be reconstructible when needed, which requires retaining index information that maps each file to its unique data units.
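The relationship between unique data units and the index needed for reconstruction can be shown with a small content-addressed store. This sketch assumes fixed-size chunking and SHA-256 digests as chunk identifiers; the patent does not state how data units are delimited or identified, and the class name `DedupStore` is hypothetical.

```python
import hashlib


class DedupStore:
    """Store each unique chunk once and keep a per-file index for reconstruction."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes (the unique data units)
        self.index = {}   # file name -> ordered list of chunk digests

    def put(self, name, data: bytes):
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # keep only the first copy
            digests.append(digest)
        self.index[name] = digests

    def get(self, name) -> bytes:
        """Rebuild the complete file content from its index entries."""
        return b"".join(self.chunks[d] for d in self.index[name])
```

Here `index` plays the role of the retained index information: duplicate chunks are stored once, yet every file can still be rebuilt in full.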
In the OCR recognition technology processing method of step A4 shown in Fig. 6, the system acquires data from social media platforms through the intelligent crawler system, downloads videos in batches with high concurrency, and extracts frames to obtain pictures. The OCR system can quickly and accurately locate the text in the pictures and recognize text that is difficult to read against natural backgrounds, with a recognition accuracy of 95% or higher.
The OCR technology comprises the following implementation steps:
1. Image preprocessing, performed so that text lines can be located and recognized more reliably, thereby improving recognition accuracy;
2. Locating all text lines in the document image and converting the text they contain into editable text;
3. Correcting the recognition result according to rules and big data analysis to improve the accuracy of character recognition;
4. Restoring the recognition result to the web interface according to the original layout of the original text image.
In the video feature recognition technology processing method of step A5 shown in Fig. 7, the system acquires data from social media platforms through the intelligent crawler system, downloads videos in batches with high concurrency, and extracts frames to obtain pictures. Models for feature recognition, object detection, classification, license plate recognition, OCR (optical character recognition) and the like are built on the theoretical foundations of computer vision, namely convolutional neural networks (CNN), recurrent neural networks (RNN), attention mechanisms, bidirectional LSTM and so on.
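Frame extraction usually means sampling frames at a fixed time interval rather than keeping every frame. The sketch below only computes which frame indices to keep, given a frame count and frame rate; decoding the frames would be done by a video library, which is omitted here. The function name and the one-second default interval are assumptions.

```python
def sample_frame_indices(total_frames, fps, interval_seconds=1.0):
    """Return the indices of frames to keep when sampling one frame per interval."""
    if total_frames < 0 or fps <= 0 or interval_seconds <= 0:
        raise ValueError("total_frames must be >= 0; fps and interval must be > 0")
    # Never step by less than one frame, even for very short intervals.
    step = max(1, round(fps * interval_seconds))
    return list(range(0, total_frames, step))
```

A shorter interval catches subtitles that appear briefly, at the cost of more pictures to pass through OCR and feature recognition.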
The foregoing describes the basic principles, principal features and advantages of the present invention, as will be appreciated by those skilled in the art. The foregoing examples and description merely illustrate the principles of the invention; various changes and modifications may be made without departing from the spirit and scope of the invention as defined by the appended claims. The scope of the invention is defined by the appended claims and their equivalents.

Claims (10)

1. A short video data processing system based on image processing technology monitoring, characterized in that: the system comprises a data source layer, a data analysis platform layer, an intelligent analysis platform layer, an application platform layer and a service presentation layer, wherein the data source layer comprises a brand platform library and a third-party search engine, and the brand platform library includes but is not limited to Douyin, Kuaishou, Xiaohongshu, Weibo, Toutiao and Bilibili; the data analysis platform layer comprises a preprocessing sublayer and an information acquisition sublayer, the information acquisition sublayer is used for collecting information from the data source layer, and the preprocessing sublayer is used for preprocessing the collected information and establishing a short video public opinion information management database; the intelligent analysis platform layer comprises a statistical analysis unit and a search engine unit, and the statistical analysis unit is used for building a basic platform for analysis, guidance and control, and service linkage command; the application platform layer applies the short video public opinion analysis data obtained by the data source layer, the data analysis platform layer and the intelligent analysis platform layer to network public opinion management, thereby forming a network public opinion monitoring system; and the service presentation layer is used for presenting the data source layer, the data analysis platform layer, the intelligent analysis platform layer and the application platform layer and providing the corresponding visual presentation for network public opinion supervision, application and command.
2. The short video data processing system based on image processing technology monitoring as recited in claim 1, wherein: the information acquisition layer comprises an acquisition source configuration module, a priority configuration module, an acquisition agent module, an acquisition group control module, an acquisition monitoring module and a data cleaning module; the acquisition source configuration module is used for enabling the network information mining engine to configure the acquisition sources of short video public opinion information; the priority configuration module is used for configuring priority storage levels for the acquired public opinion information; the acquisition agent module is used for transmitting the acquired short video public opinion data to the server at regular intervals and for responding in real time to monitoring requests sent by the console; the acquisition group control module is used for uniformly controlling and managing a plurality of short video public opinion data targets; the acquisition monitoring module is used for monitoring and controlling the field operation equipment that acquires the short video public opinion data; and the data cleaning module is used for rechecking and verifying the collected public opinion data, deleting duplicate public opinion data, correcting erroneous data and ensuring the consistency of the public opinion data.
3. The short video data processing system based on image processing technology monitoring as recited in claim 1, wherein: the preprocessing layer comprises a public opinion early warning module, a data summarizing module, an automatic abstract module, a Chinese word segmentation module, an index building module, a similar content merging module, an incremental synchronization module, an automatic classification module, a keyword extraction module and a hot word extraction module; the public opinion early warning module is used for issuing automatic early warnings according to preset early warning keywords; the data summarizing module is used for classifying and summarizing public opinion data records according to set standards; the automatic abstract module is used for automatically generating brief, compressed summaries of short video public opinion; the Chinese word segmentation module is used for performing Chinese word segmentation and automatically identifying the meaning of short video public opinion sentences; the index building module is used for creating all configurations related to index definition; the similar content merging module is used for processing and judging similar short video public opinion data; the incremental synchronization module is used for synchronizing daily incremental change data to the data storage module; the automatic classification module is used for sorting by index according to user-defined classification standards; the keyword extraction module is used for extracting candidate words and deciding which keywords to output; and the hot word extraction module is used for extracting high-frequency hot words.
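As an illustration of the hot word extraction module in claim 3, the following minimal Python sketch counts word frequencies across public opinion texts and returns the most frequent words. The function name `extract_hot_words`, the stop-word set and the whitespace tokenization are assumptions made for this example; the claim does not specify an algorithm, and Chinese text would first need the segmentation step that the claim assigns to the Chinese word segmentation module.

```python
from collections import Counter


def extract_hot_words(texts, top_n=10, stop_words=frozenset()):
    """Return the top_n most frequent words as (word, count) pairs.

    Assumption: texts are already segmented so that words are separated
    by whitespace. Words in stop_words are ignored.
    """
    counts = Counter()
    for text in texts:
        for word in text.lower().split():
            if word not in stop_words:
                counts[word] += 1
    return counts.most_common(top_n)


if __name__ == "__main__":
    sample = [
        "power outage in city",
        "outage repair crew",
        "power outage again",
    ]
    print(extract_hot_words(sample, top_n=2, stop_words={"in"}))
```

A production system would likely weight words by recency or by platform rather than by raw counts; the raw count is used here only because it is the simplest reading of "high-frequency".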
4. The short video data processing system based on image processing technology monitoring as recited in claim 1, wherein: the statistical analysis unit comprises a media attention module, an automatic clustering module, a phrase relation building module, a trend analysis module, a statistical report module, a topic research and judgment module, an analysis model construction module, a hot event module, a hot word analysis module, a propagation track module and an industry index release module; the media attention module is used for classifying the attributes of different short video platforms and accounts, judging whether the online public opinion media are available, and counting and distilling recurring words, sentences and titles that attract public attention; the automatic clustering module is used for automatically gathering and classifying public opinion data with identical or similar content; the phrase relation building module is used for automatically extracting and judging the central meaning of public opinion texts and building an attribute library; the trend analysis module is used for computing the historical development trajectory of a public opinion event from its occurrence frequency per user-defined period, and for estimating its future development trend; the statistical report module is used for presenting the processed and counted public opinion data in tabular form; the topic research and judgment module is used for capturing the central idea of a public opinion text and judging its attributes; the analysis model construction module is used for customizing option weights and building analysis models for different types of public opinion data; the hot event module is used for counting and displaying public opinion events that occur with high frequency; the hot word analysis module is used for extracting and displaying words that occur with high frequency across all public opinion events; the propagation track module is used for tracing the historical development trajectory of public opinion events and for computing, researching and judging their propagation track; and the industry index release module is used for publishing indices that reflect the development of various industries in the market.
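The trend analysis module of claim 4 counts event occurrences per user-defined period and projects a future trend, but the claim names no estimator. The sketch below is one possible reading under stated assumptions: events are bucketed by a fixed number of days, and the next period is projected with an ordinary least-squares line. The function names `bucket_counts` and `linear_forecast` are hypothetical.

```python
from collections import Counter
from datetime import date


def bucket_counts(event_dates, start, period_days):
    """Count events per period.

    Bucket i covers [start + i*period_days, start + (i+1)*period_days).
    Dates before start are ignored.
    """
    counts = Counter(
        (d - start).days // period_days for d in event_dates if d >= start
    )
    n_buckets = max(counts) + 1 if counts else 0
    return [counts.get(i, 0) for i in range(n_buckets)]


def linear_forecast(series):
    """Fit a least-squares line to (0, y0), (1, y1), ... and return the
    value the line predicts for the next index."""
    n = len(series)
    if n < 2:
        return float(series[-1]) if series else 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(series) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(series))
    slope = sxy / sxx
    return mean_y + slope * (n - mean_x)


if __name__ == "__main__":
    start = date(2023, 5, 1)
    events = [date(2023, 5, 1), date(2023, 5, 9), date(2023, 5, 10)]
    history = bucket_counts(events, start, period_days=7)
    print(history, linear_forecast(history))
```

A straight line is a deliberately simple choice; bursty public opinion events may call for a different model, which the claim leaves open.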
5. The short video data processing system based on image processing technology monitoring as recited in claim 1, wherein: the application platform layer comprises an Internet application unit, a classification detection unit, a public opinion report unit, a hot public opinion unit, a keyword configuration unit, an early warning configuration unit, an address book management unit and a user management unit; the Internet application unit is used for collecting and summarizing Internet applications on the application platform; the classification detection unit is used for performing independent classification detection on short video public opinion information; the public opinion report unit is used for generating public opinion information reports; the hot public opinion unit is used for presenting high-frequency public opinion within a specific time period; the keyword configuration unit is used for editing and configuring keywords and exclusion-word syntax; the early warning configuration unit is used for configuring the pushing of early warning information; the address book management unit is used for managing contacts associated with the short video data processing system; and the user management unit is used for managing the user list and user permissions under an administrator account.
6. A short video data processing method based on image processing technology monitoring, characterized by comprising the following data processing methods:
A1. Data architecture processing method: the network public opinion monitoring system divides the processing of public opinion videos from mainstream Internet short video platforms among a data acquisition module, a data processing module, a data management and storage module and a data display module, and analyzes and manages the Internet public opinion data after acquisition to realize data visualization;
A2. Data acquisition technology processing method: data acquisition is completed through the cooperation of a crawler server cluster; massive public opinion data are obtained by monitoring Toutiao, Kuaishou, Weibo, Douyin, Xiaohongshu and Bilibili, and useful public opinion information is screened out; network public opinion is monitored by setting public opinion keywords, and once the network public opinion monitoring system matches the corresponding keywords, the information is acquired and displayed;
A3. Data deduplication technology processing method: data deduplication means finding and deleting duplicate public opinion data in a set of digital files and storing only unique data units; data reconstruction is considered at deletion time, that is, although part of a file's content is deleted, the complete file content can still be reconstructed when needed, which requires keeping index information between each file and its unique data units;
A4. OCR recognition technology processing method: the system acquires data from social media platforms through the intelligent crawler system, downloads videos in batches with high concurrency, and extracts frames to obtain pictures, so that the OCR system can quickly and accurately locate and recognize the characters in the pictures; OCR recognition is performed on scenes and objects appearing in the short video, and the subtitle content of the short video is also extracted by OCR, which facilitates the full-scale acquisition and monitoring of public opinion information;
A5. Video feature recognition technology processing method: data are acquired from social media platforms through the intelligent crawler system, videos are downloaded in batches with high concurrency, and frames are extracted to obtain pictures;
A6. Construction method of the short video network public opinion monitoring system for enterprise image public opinion supervision: public opinion related to the enterprise image is monitored on popular short video platforms including but not limited to Douyin, Weibo, Kuaishou, Toutiao, Bilibili and Xiaohongshu; for public opinion information related to enterprise management, service response, power supply, staff and the company brand image, fully automatic real-time monitoring, analysis and early warning run 7×24 hours, which systematically improves the monitored company's ability to control Internet public opinion risk, helps the monitored company discover network public opinion more efficiently, promptly and comprehensively, and gains a head start for handling and guiding public opinion;
A7. Enterprise image public opinion analysis processing method: through the professional analysis capability of the network public opinion monitoring system, accurate public opinion information retrieval and in-depth analysis references are obtained; the monitored company is helped to grasp network public opinion trends on short video platforms in time, to discover and handle important, high-impact events quickly, and to guide public opinion and publicity in a positive direction; a decision basis is provided for public opinion management, the monitored company is helped to better maintain its brand image, and a good public opinion environment is provided for enterprise development.
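Step A3 of claim 6 requires that deduplicated files remain reconstructable through index information linking each file to its unique data units. A minimal in-memory sketch of that idea follows. It assumes fixed-size chunks identified by SHA-256 digests; the class name `DedupStore` and its methods are hypothetical, and the claim does not prescribe chunking or hashing.

```python
import hashlib


class DedupStore:
    """Store each unique chunk once and keep a per-file index of chunk
    digests so that any file can be rebuilt in full."""

    def __init__(self, chunk_size=64):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes (unique data units)
        self.index = {}   # file name -> ordered list of digests

    def put(self, name, data):
        """Split data into chunks, store unseen chunks, record the index."""
        digests = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # duplicates are skipped
            digests.append(digest)
        self.index[name] = digests

    def get(self, name):
        """Reconstruct the complete file content from its index."""
        return b"".join(self.chunks[d] for d in self.index[name])


if __name__ == "__main__":
    store = DedupStore(chunk_size=4)
    store.put("a.txt", b"abcdabcdxyz")
    store.put("b.txt", b"abcdxyz")
    print(len(store.chunks), store.get("a.txt"))
```

Fixed-size chunking keeps the sketch short; it only detects duplicates that align on chunk boundaries, so a real system might prefer content-defined chunking.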
7. The short video data processing method based on image processing technology monitoring according to claim 6, wherein the data processing method comprises the following specific processing modes:
B1. setting a keyword scheme to query and monitor public opinion information, and configuring different keywords according to actual service requirements to acquire related information from short video platforms, which provides a convenient scheme for discovering public opinion hotspots and monitoring key public opinion;
B2. after the scheme is set, viewing the public opinion data monitored under the scheme in an information list;
B3. setting a shortcut search, in which one or two simple keywords are set for preliminary retrieval of information;
B4. screening, according to service requirements, short video users with large follower counts, high activity and/or large influence to generate a key account pool, adding corresponding key users for monitoring as business experience accumulates, monitoring and collecting key accounts on different platforms, and monitoring the public opinion information published by the short video users in the key account pool, thereby enabling further AI analysis;
B5. visual data: analyzing and judging the scheme from multiple dimensions on a large visual data screen, and presenting the analyzed data as charts, the charts covering sensitive information content, popular information content, regional distribution of publishers, emotion trend graphs, trends in work publication and interaction, and information source distribution;
B6. for each public opinion scheme, the network public opinion monitoring system analyzes and processes public opinion data from multiple dimensions and supports generating data charts for a single scheme set by the user, the data charts covering public opinion source distribution, public opinion volume per keyword, content word clouds and sensitive classification information;
B7. according to the monitoring schemes of the network public opinion monitoring system, obtaining the relevant daily, weekly and monthly data and compiling them into reports, which helps the monitored company understand recent public opinion and improves the efficiency of public opinion handling;
B8. the network public opinion monitoring system collects public opinion data from the whole network, tracks and analyzes the overall propagation of public opinion information on each short video platform, and automatically generates a multi-dimensional comprehensive analysis report covering event trends, website statistics and propagation paths.
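The keyword schemes of steps B1 and B2 can be pictured as named sets of trigger words, applied to incoming posts to fill an information list per scheme. The sketch below is an assumption-laden illustration: the `KeywordScheme` class, its `any_of` and `exclude` fields, and plain substring matching are all hypothetical choices, and the exclusion list is only one plausible reading of the keyword configuration in the claims.

```python
from dataclasses import dataclass, field


@dataclass
class KeywordScheme:
    """A named monitoring scheme: match if any keyword occurs and no
    exclusion word occurs (case-insensitive substring match)."""
    name: str
    any_of: list
    exclude: list = field(default_factory=list)

    def matches(self, text):
        lowered = text.lower()
        if any(word.lower() in lowered for word in self.exclude):
            return False
        return any(word.lower() in lowered for word in self.any_of)


def monitor(posts, schemes):
    """Return {scheme name: [matching posts]}, i.e. one information
    list per scheme."""
    hits = {scheme.name: [] for scheme in schemes}
    for post in posts:
        for scheme in schemes:
            if scheme.matches(post):
                hits[scheme.name].append(post)
    return hits


if __name__ == "__main__":
    outage = KeywordScheme("outage", ["power outage", "blackout"], ["drill"])
    posts = [
        "Blackout in district 3 since noon",
        "Planned blackout drill tomorrow",
        "New cafe opening",
    ]
    print(monitor(posts, [outage]))
```

Substring matching is used for brevity; matching on segmented words would avoid false hits inside longer words.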
8. The short video data processing method based on image processing technology monitoring according to claim 6, wherein the OCR technology processing method in step A4 comprises the following implementation steps:
C1. image preprocessing, performed so that text lines can be better located and recognized, thereby improving the accuracy of public opinion data recognition;
C2. locating all text lines of the document image and converting the text into editable text information;
C3. correcting the recognition result according to rules and big data analysis to improve the accuracy of character recognition;
C4. correcting the recognition result against the original text image to improve the accuracy of character recognition;
C5. restoring the recognition result to the WEB interface according to the original layout of the source text image.
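Step C3 corrects recognition results "according to rules and big data analysis" without naming a technique. One possible rule-based reading, sketched here under assumptions, snaps each recognized token to the closest entry in a domain lexicon using `difflib.get_close_matches` from the Python standard library. The function name `correct_text`, the sample lexicon and the 0.75 similarity cutoff are hypothetical, and the OCR step itself is outside the sketch.

```python
import difflib


def correct_text(recognized, lexicon, cutoff=0.75):
    """Replace each whitespace-separated token with its closest lexicon
    entry when the similarity ratio reaches the cutoff.

    Tokens already in the lexicon, and tokens with no close match,
    are left unchanged.
    """
    corrected = []
    for token in recognized.split():
        if token in lexicon:
            corrected.append(token)
            continue
        candidates = difflib.get_close_matches(
            token, lexicon, n=1, cutoff=cutoff
        )
        corrected.append(candidates[0] if candidates else token)
    return " ".join(corrected)


if __name__ == "__main__":
    lexicon = ["power", "supply", "outage"]
    # Typical OCR confusions: digit 0 for letter o, digit 1 for letter l.
    print(correct_text("p0wer supp1y outage", lexicon))
```

The cutoff trades recall against the risk of rewriting a correct but out-of-lexicon word; it would need tuning on real recognition output.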
9. The short video data processing method based on image processing technology monitoring according to claim 6, wherein: in the data acquisition technology processing method of step A2, the monitored websites are continuously scanned using a massive pool of IP addresses and by simulating the access behavior of natural persons; the acquired data are stored in a distributed storage service cluster, and all actions and behavior logs are recorded in a log server cluster; and the collected data are transmitted through a collected-data interface to the data processing sub-modules, such as emotion judgment and natural language recognition.
10. The short video data processing method based on image processing technology monitoring according to claim 6, wherein: the data acquisition module in step A1 acquires Internet short video information from Toutiao, Weibo, Kuaishou, Douyin, Bilibili and Xiaohongshu through a crawler server cluster, and transmits the information to the data processing module after URL deduplication, collaborative crawling, template matching for known websites and automatic calculation for unknown websites; the data processing of the data processing module in step A1 comprises automatic summarization, noise calculation, text classification, text word segmentation, viewpoint extraction, region identification, sensitive content discovery, hot spot calculation, burst calculation, event extraction and subject word extraction; the management and storage of the data management and storage module in step A1 comprise data distribution, data storage, automatic backup, distributed indexing, query management and advanced calculation; and the data display of the data display module in step A1 comprises user configuration, query requests and data presentation on a front-end WEB interface.
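The deduplication step that claim 10 places before hand-off to the data processing module can be read as deduplicating crawl URLs. Under that reading, a minimal sketch normalizes each URL and tracks the normalized forms in a set. The normalization rules shown (lowercase scheme and host, sorted query parameters, no fragment, no trailing slash) and the names `normalize_url` and `UrlFrontier` are assumptions for illustration; `example.com` is a placeholder host.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url):
    """Map equivalent spellings of a URL to one canonical string."""
    parts = urlsplit(url.strip())
    # Sort query parameters so that ?b=2&a=1 equals ?a=1&b=2.
    query = urlencode(sorted(parse_qsl(parts.query, keep_blank_values=True)))
    path = parts.path.rstrip("/") or "/"
    # The fragment is dropped: it never reaches the server.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, query, "")
    )


class UrlFrontier:
    """Remember which URLs have been scheduled for crawling."""

    def __init__(self):
        self._seen = set()

    def add(self, url):
        """Return True if the URL is new, False if it is a duplicate."""
        key = normalize_url(url)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


if __name__ == "__main__":
    frontier = UrlFrontier()
    print(frontier.add("HTTPS://Example.com/video/1/?b=2&a=1#top"))
    print(frontier.add("https://example.com/video/1?a=1&b=2"))
```

An in-memory set suits a single process; a crawler server cluster, as described in the claim, would need a shared structure, which the claim does not detail.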

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310494088.6A CN117235343A (en) 2023-05-05 2023-05-05 Short video data processing system and processing method based on image processing technology monitoring

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310494088.6A CN117235343A (en) 2023-05-05 2023-05-05 Short video data processing system and processing method based on image processing technology monitoring

Publications (1)

Publication Number Publication Date
CN117235343A true CN117235343A (en) 2023-12-15

Family

ID=89083261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310494088.6A Pending CN117235343A (en) 2023-05-05 2023-05-05 Short video data processing system and processing method based on image processing technology monitoring

Country Status (1)

Country Link
CN (1) CN117235343A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination