CN109063198B

CN109063198B - Multi-dimensional visual search recommendation system for fusing media resources

Info

Publication number: CN109063198B
Application number: CN201811052017.6A
Authority: CN
Inventors: 潘宁宁; 张莹; 高雅萍; 安钧; 杨川疆; 胡瀛斌; 章丽兰; 傅婷婷; 江冠阳; 陆蕴超; 朱浩路
Original assignee: Radio and Television Group of Zhejiang
Current assignee: Radio and Television Group of Zhejiang
Priority date: 2018-09-10
Filing date: 2018-09-10
Publication date: 2022-02-11
Anticipated expiration: 2038-09-10
Also published as: CN109063198A

Abstract

The invention discloses a multi-dimensional visual search recommendation system for fusing media resources, which combines personalized recommendation technology with visualization technology. A user inputs query keywords on a user interface; the open-source full-text search engine Lucene builds an index over the media asset database; and JSON data in different formats is constructed from the search result data, fed to four different visualization tools (hierarchy, time, map and word cloud), and displayed on the user interface. After the user interacts with the visual interface, the recommendation module recommends media resources with high similarity to the media resource the user clicked. The system provides visual display of the search results in four aspects and, at the same time, recommends the most relevant media material resources to the user according to the user's interaction behavior.

Description

Multi-dimensional visual search recommendation system for fusing media resources

Technical Field

The invention relates to the field of computer technology, and in particular to a multi-dimensional visual search recommendation system for fusing media resources, which can quickly retrieve a media asset library along the three dimensions of hierarchy, time and space and visually display the search results.

Background

At present, each major television station in China has established an independent media asset library to store large amounts of video, audio, text and picture resources. For these cross-media resources, each major television station has also established a media asset management system to make reasonable use of, uniformly archive, centrally manage and reasonably circulate the media resources, thereby increasing the utilization value of program resources. The main functions of a media asset management system include retrieval, downloading, cataloguing, program warehousing and background management. First, most of the query functions of such systems concentrate on offering various query modes, while the result set is displayed only as a list; the display is monotonous, the correlation among resources is unclear, and interaction with the user is lacking. Second, media asset library resources keep growing over time, and current query technology may struggle to support queries over this continuously growing mass of data, so the query service urgently needs to be optimized for the business scenario. Third, when the result set is large, the user spends a great deal of time browsing and searching for related content item by item, so recommending high-quality content based on the user's interaction behavior is also a problem that urgently needs to be solved.

Disclosure of Invention

In view of the shortcomings of the prior art, the invention provides a multi-dimensional visual search recommendation system for fusing media resources. In a continuously growing, massive media material resource library, the system quickly retrieves material resources by optimizing query conditions, visually displays the search results to the user in several forms, and recommends related resources based on the text similarity of resource manuscripts, thereby improving user interactivity and helping journalists and editors draw topic-selection inspiration from the manuscripts.

The multi-dimensional visual search recommendation system for fusing media resources combines personalized recommendation technology with visualization technology and is divided into four main parts: a user interface, data query and processing, a visualization module and a recommendation module. The user inputs a query keyword on the user interface; an index is built over the media asset database with Lucene, and a Lucene query on the keyword returns the search results. JSON data in different formats is constructed from the search result data, fed to four different visualization tools (hierarchy, time, map and word cloud), and displayed on the user interface. After the user interacts with the visual interface, media resources with high similarity to the clicked media resource are recommended to the user.

(1) User interaction interface

The user interaction interface is divided into a search area, a hierarchy display area, a geographical display area, a timeline display area, a list area of related resources recommended according to user interaction, a video display of the single resource selected by the user, and a manuscript word cloud display. Data is transmitted between the client and the server in JSON format. The user completes all operations within a single interface, which reduces the complexity of operation.

(2) Keyword search module

The server side uses the open-source full-text search engine Lucene to build an index over the commonly used fields in the media asset database, and the Chinese word segmentation package IKAnalyzer is configured for Lucene. The user inputs a query keyword through the user interface; after receiving the keyword, Lucene returns the corresponding search results from the index. The server then preprocesses the data and constructs JSON data in different formats for the different display modes.
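
As a rough illustration of constructing JSON in different formats for the different display modes, the following Python sketch reshapes one list of search hits into a hierarchy view, a map view and a time view. It is only a sketch under stated assumptions: the patent does not disclose the JSON layouts, so every field name (`channel`, `column`, `reporter`, `created`, `lat`, `lng`, `children`) and the sample records are hypothetical, and the hits stand in for what a Lucene query would return. The nesting follows the channel-column-reporter-video structure described in the embodiment.

```python
import json
from collections import defaultdict

# Hypothetical search hits; field names and values are illustrative only.
hits = [
    {"id": 1, "title": "Flood relief report", "channel": "News", "column": "Evening News",
     "reporter": "Reporter A", "created": "2018-07-02", "lat": 30.27, "lng": 120.15},
    {"id": 2, "title": "West Lake festival", "channel": "News", "column": "Evening News",
     "reporter": "Reporter B", "created": "2018-05-11", "lat": None, "lng": None},
    {"id": 3, "title": "Tea harvest", "channel": "Life", "column": "Countryside",
     "reporter": "Reporter A", "created": "2018-04-20", "lat": 30.23, "lng": 120.10},
]


def hierarchy_view(hits):
    """Nest hits as channel -> column -> reporter -> video."""
    tree = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for h in hits:
        tree[h["channel"]][h["column"]][h["reporter"]].append(
            {"id": h["id"], "name": h["title"]})
    return [
        {"name": channel, "children": [
            {"name": column, "children": [
                {"name": reporter, "children": videos}
                for reporter, videos in reporters.items()
            ]} for column, reporters in columns.items()
        ]} for channel, columns in tree.items()
    ]


def map_view(hits):
    """Keep only the hits that already carry coordinates."""
    return [{"id": h["id"], "title": h["title"], "lat": h["lat"], "lng": h["lng"]}
            for h in hits if h["lat"] is not None and h["lng"] is not None]


def time_view(hits):
    """Order hits from earliest to most recent (ISO dates sort as strings)."""
    return [{"id": h["id"], "title": h["title"], "date": h["created"]}
            for h in sorted(hits, key=lambda h: h["created"])]


def build_payload(hits):
    """Bundle the three view-specific structures into one response body."""
    return {"hierarchy": hierarchy_view(hits), "map": map_view(hits), "time": time_view(hits)}


payload = build_payload(hits)
response_body = json.dumps(payload, ensure_ascii=False)
```

Keeping one function per view means each visualization tool receives only the shape it needs, and a hit without coordinates simply drops out of the map view while remaining in the other two.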

(3) Visualization module

The visualization module displays the search result set in four modes: a hierarchical display based on the organizational structure of the Zhejiang television station, a map display based on the geographic coordinate information of the resources, a timeline display following the chronological order of the resources in the result set, and a word cloud display generated by segmenting the manuscript of a single resource after that resource is clicked.
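
The word cloud mode can be illustrated with a small Python sketch that turns segmented manuscript words into frequency-ranked entries with a display size. This is an assumption-laden sketch, not the disclosed implementation: the tokens are taken as already segmented (the system itself uses IKAnalyzer for this), and the linear scaling, the `min_size`/`max_size`/`top_n` parameters and the output field names are hypothetical choices for illustration.

```python
from collections import Counter


def word_cloud_entries(tokens, min_size=12, max_size=48, top_n=50):
    """Return word cloud entries sorted by descending word frequency.

    Each entry has the word, its count, and a display size scaled linearly
    between min_size and max_size (an assumed scaling rule).
    """
    counts = Counter(tokens).most_common(top_n)
    if not counts:
        return []
    highest = counts[0][1]
    lowest = counts[-1][1]
    span = highest - lowest
    entries = []
    for word, count in counts:
        # When every word has the same count, fall back to the maximum size.
        scale = (count - lowest) / span if span else 1.0
        size = round(min_size + scale * (max_size - min_size))
        entries.append({"text": word, "count": count, "size": size})
    return entries


# Tokens assumed to come from a word segmentation step.
tokens = ["flood", "relief", "flood", "village", "flood", "relief", "rescue"]
entries = word_cloud_entries(tokens)
```

With this sample input, the most frequent word receives the largest size and the least frequent words the smallest, which matches the idea that higher-frequency words are shown more prominently.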

(4) Recommendation module

After the user clicks a single resource, the text word cloud of that resource is returned and displayed, and the manuscript information of the clicked resource is compared with the manuscript information of the other resources in the search result set using a vector space model algorithm. After the similarity calculation, the resources with high relevance are recommended to the user. In addition, the user can choose to rank the resource list by download count or by browse count.
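
The vector space model comparison can be sketched in Python as follows. The patent names the vector space model but does not specify the term weighting, so this sketch assumes raw term-frequency vectors and cosine similarity; the function names, resource IDs and sample manuscripts are hypothetical, and the manuscripts are assumed to be already segmented into words.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(clicked_id, manuscripts, top_k=5):
    """Rank the other resources in the result set by similarity to the clicked one."""
    vectors = {rid: Counter(tokens) for rid, tokens in manuscripts.items()}
    target = vectors[clicked_id]
    scored = [(rid, cosine_similarity(target, vec))
              for rid, vec in vectors.items() if rid != clicked_id]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical segmented manuscripts keyed by resource ID.
manuscripts = {
    "v1": ["flood", "relief", "village", "rescue"],
    "v2": ["flood", "rescue", "team", "village"],
    "v3": ["tea", "harvest", "spring"],
    "v4": ["flood", "warning", "river"],
}
ranking = recommend("v1", manuscripts)
```

Here `v2` shares three of four terms with the clicked manuscript and ranks first, while `v3` shares none and scores zero. A production system might weight terms by TF-IDF instead, but that is a design choice the source does not state.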

Advantageous effects:

The visual search recommendation system provided by the invention provides visual display of the search results in four aspects and recommends the most relevant media material resources to the user according to the user's interaction behavior.

Drawings

Fig. 1 is a diagram of the network architecture of the client and server in accordance with the present invention.

Fig. 2 is a flow chart of the operation of the system of the present invention.

Detailed Description

The technical solution of the present invention is further described in more detail with reference to the accompanying drawings and specific embodiments. It is to be understood that the described embodiments are merely a few embodiments of the invention, and not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, shall fall within the scope of protection of the present invention.

Fig. 1 is a schematic diagram of a network architecture of the system of the present invention. In this embodiment, the system includes the internet 10, a stand-alone client 20, a router 30, a firewall 40, a background Tomcat server 50, an Oracle database server 60, and a MySql database server 70. The stand-alone client 20 can be connected to the internet 10 and connected to a background Tomcat server 50 deployed in the intranet through the router 30 and the firewall 40, and the background Tomcat server 50 can be connected to an Oracle database 60 storing a central media asset database and a MySql database 70 storing geographic information and user information.

The operation steps are as follows:

Fig. 2 is a schematic diagram of the operation flow provided by the embodiment of the present invention. The flow comprises the following steps:

Step S1: the interface of the visual search recommendation system is opened on the stand-alone client 20 connected to the internet 10.

Step S2: keywords are entered in the search box and a query is submitted.

Step S3: the background searches the previously created Lucene index file and returns the search result set.

Step S4: the background encapsulates the returned search result data along the three dimensions of hierarchy, geography and time, and returns the data to the foreground interface in JSON format through the API interface.

Step S5: the foreground acquires the data and takes out the hierarchical data.

Step S6: the user can select the required video and audio resources in the hierarchical data according to the channel-column-reporter-video four-layer structure. The user clicks on the title of a certain audio-visual resource.

Step S7: in the hierarchical data, a button for adding geographic information is placed in front of each video and audio resource, so that reporters and editors can conveniently add geographic information manually, including geographic position and description information, to video and audio resources in the central media assets that lack it.

Step S8: after step S6, the background Tomcat server 50 obtains the manuscript information of the video and audio resource clicked by the user.

Step S9: the background Tomcat server 50 performs word segmentation on the manuscript information, using IKAnalyzer as the Chinese word segmentation package, and sorts the resulting words by their frequency in the manuscript.

Step S10: the sorted words are returned to the interface of the client 20, and the word cloud is displayed through cloud.js.

Step S11: after acquiring the manuscript information in step S8, the Tomcat server 50 calculates the vector similarity between that manuscript and the manuscript information of the other videos and audios in the search result set of step S3, obtains the similarity values, and ranks them.

Step S12: the recommended video and audio contents are displayed according to the returned ranking, and within the recommendation interface they can also be sorted and displayed by video and audio browse count and click count.

Step S13: after steps S10 and S12, the text word cloud, the related recommended video list, and the specific information of the video and audio, such as title, high definition, reporter, and creation time, are displayed on the foreground interface of the client 20.

Step S14: the video and audio geographic information manually added in step S7 is stored in the geographic data of the MySql database server 70 in the background.

Step S15: when the user clicks a single video and audio title in the hierarchical data, the specific geographic information and description information of that video is marked on the map, visualizing the geographic data.

Step S16: the foreground acquires the data and takes out the time data.

Step S17: the time data is visualized in the form of a timeline. After the user searches for a keyword and obtains a search result, each video resource in the result is placed on the timeline in order from earliest to most recent, so that the relationships among the videos in the result set are shown directly in the time dimension. The timeline display is implemented using TimelineJs. When the user searches for a keyword, a timeline data request is sent to the data layer through the logic layer; the data layer forms the JSON data and stores it in a JSON file, and the presentation layer displays the timeline by loading the JSON file.
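
The timeline data formed by the data layer might look like the output of the following Python sketch. It assumes the TimelineJS 3 JSON layout (an `events` list whose items carry `start_date` and `text`); the patent does not state which TimelineJs version or layout was used, and the input field names (`title`, `summary`, `created`) and sample records are hypothetical.

```python
import json
from datetime import date


def to_timeline_json(results):
    """Convert search results into a TimelineJS 3 style dictionary.

    Events are ordered from earliest to most recent creation date.
    """
    ordered = sorted(results, key=lambda r: r["created"])
    events = []
    for r in ordered:
        created = r["created"]
        events.append({
            "start_date": {"year": created.year, "month": created.month, "day": created.day},
            "text": {"headline": r["title"], "text": r.get("summary", "")},
        })
    return {"events": events}


# Hypothetical search results; only the fields used above are required.
results = [
    {"title": "Flood relief report", "created": date(2018, 7, 2), "summary": "Evening News"},
    {"title": "Tea harvest", "created": date(2018, 4, 20)},
    {"title": "West Lake festival", "created": date(2018, 5, 11)},
]

timeline = to_timeline_json(results)
# The serialized string could then be written to the JSON file loaded by the presentation layer.
timeline_text = json.dumps(timeline, ensure_ascii=False, indent=2)
```

Sorting on the server side keeps the JSON file itself in chronological order, so the file remains readable even outside the timeline widget.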

Claims

1. A multi-dimensional visual search recommendation method for fusing media resources, characterized by comprising the following steps:

step S1: opening an interface of a visual search recommendation system on a stand-alone client connected with the Internet;

step S2: inputting key words in a search box and inquiring;

step S3: the background returns a search result set by searching the created Lucene index file;

step S4: the background encapsulates the returned search result data according to three directions of hierarchy, geography and time, and returns the data to the foreground interface in JSON format through the API interface;

step S5: the foreground acquires data and takes out hierarchical data;

step S6: the user can select the required video and audio resources according to the channel-column-reporter-video four-layer structure in the hierarchical data, and the user clicks the title of a certain video and audio resource;

step S7: in the hierarchical data, a button for adding geographic information is arranged in front of each video and audio resource, so that a user can conveniently edit and manually add the geographic information, including geographic position and description information, to the video and audio resources without the geographic information in the central media resources;

step S8: after step S6, the background Tomcat server obtains the manuscript information of the video and audio resource clicked by the user;

step S9: the background Tomcat server carries out word segmentation operation on the manuscript information, uses IKAnalyzer as a Chinese word segmentation packet, and carries out sequencing according to the word frequency of the words in the manuscript after word segmentation;

step S10: returning the sorted words to the client interface and displaying the word cloud through cloud.js, wherein the words are ranked by word frequency, and higher-ranked words are displayed at a larger size;

step S11: after step S8, the Tomcat server performs vector similarity calculation between the manuscript information and the manuscript information of the other videos and audios in the search result set of step S3 to obtain similarity values, and ranks them from high to low;

step S12: displaying corresponding video and audio contents in sequence according to the returned sequencing result, and in an interface of the sequencing result, sequencing and displaying according to video and audio browsing amount and video and audio clicking amount;

step S13: displaying the related recommended video list obtained in the step S12, the manuscript word cloud of the single video and audio obtained in the step S10, and the specific information of the video and audio, such as title, high definition, reporter, and creation time, on a foreground interface of the client;

step S14: after the video and audio geographic information is manually added in the step S7, the video and audio geographic information is directly stored in the geographic data of the MySql database server at the background;

step S15: when a user clicks a single video and audio title in the hierarchical data, marking specific geographic information and description information of the video and audio in a map, and visualizing the geographic data;

step S16: the foreground acquires data and takes out time data;

step S17: the time data is visualized in the form of a timeline using TimelineJs; after a user searches for a keyword and obtains a search result, each video resource in the search result is placed on the timeline in order from earliest to most recent.