WO2019166858A1 - File hosting service in cloud - Google Patents
File hosting service in cloud
- Publication number
- WO2019166858A1 (PCT/IB2018/051301)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- file
- files
- users
- cluster
- title
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/25—Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
- H04N21/258—Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
- H04N21/25866—Management of end-user data
- H04N21/25875—Management of end-user data involving end-user authentication
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/27—Server based end-user applications
- H04N21/274—Storing end-user multimedia data in response to end-user request, e.g. network recorder
- H04N21/2743—Video hosting of uploaded data from client
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Graphics (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Here we provide a file hosting service in the cloud in which each user is given a unique identification and password. Users can upload media content such as video, audio, static web pages or files, documents, or presentations for sharing with other users, who can play back the media content or retrieve the documents on their devices. Users upload these files along with a title, a short description, and any authors or owners, if applicable. After authentication, other users can view, comment on, or like a file by providing its unique file name. Users can also retrieve a file or files with a search query consisting of a keyword or a long phrase or sentence, instead of providing a file name.
Description
File Hosting Service in Cloud
In this invention we provide a file hosting service in the cloud in which each user is given a unique identification and password. Users can upload media content such as video, audio, static web pages or files, documents, or presentations for sharing with other users, who can play back the media content or retrieve the documents on their devices. Users upload these files along with a title, a short description, and any authors or owners, if applicable. Each uploaded file is stored in the cloud under a unique file name given by the user. After authentication, other users can view, comment on, or like a file by providing its unique file name. For each file we maintain three lists: the users who have viewed or are viewing the file; the users who have commented, along with their comments, ordered either alphabetically by user identification or chronologically, along with priority, with a filter available for the same; and the users who have liked the file.
Users can also retrieve a file or files with a search query consisting of a keyword or a long phrase or sentence, instead of providing a file name. For a keyword, the file hosting service performs a full-text search only on the title, description, and author or owner (as applicable) of all files and returns the list of relevant files ranked so that the file with the highest frequency of the keyword is first and the file with the lowest frequency is last. Similarly, for long phrases or sentences we perform semantic parsing of the phrase or sentence to get a formal representation of its meaning. In this case we form a cluster of files with similar semantic context with respect to the title and description of the files.
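The keyword ranking described above can be sketched as follows. This is a minimal illustration and not the applicant's implementation: the `HostedFile` record, the `keyword_frequency` and `search` helpers, and the whole-word, case-insensitive matching rule are all assumptions introduced here, since the text does not say how occurrences are counted.

```python
import re
from dataclasses import dataclass


@dataclass
class HostedFile:
    # Hypothetical record; the field names are assumptions for illustration.
    name: str          # unique file name given by the uploader
    title: str
    description: str
    author: str = ""   # author or owner, if applicable


def keyword_frequency(f: HostedFile, keyword: str) -> int:
    """Count whole-word, case-insensitive occurrences of the keyword
    in the title, description, and author fields only."""
    text = " ".join([f.title, f.description, f.author]).lower()
    pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
    return len(re.findall(pattern, text))


def search(files: list[HostedFile], keyword: str) -> list[HostedFile]:
    """Return files that mention the keyword, highest frequency first."""
    scored = [(keyword_frequency(f, keyword), f) for f in files]
    hits = [(count, f) for count, f in scored if count > 0]
    # Python's sort is stable, so files with equal counts keep upload order.
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [f for _, f in hits]


files = [
    HostedFile("c.pdf", "Cloud pricing", "a cost guide", "Cy"),
    HostedFile("b.pdf", "Cooking", "pasta recipes", "Bob"),
    HostedFile("a.mp4", "Cloud basics", "cloud cloud storage", "Ann"),
]
ranked_names = [f.name for f in search(files, "cloud")]
```

Files that never mention the keyword are dropped rather than ranked last; that is a design choice of this sketch, and a service could equally return them at the end of the list.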
We form these clusters by performing Natural Language Processing on the title and description of all files: we identify important words with syntactic analysis and apply a Word Sense Disambiguation System for better file classification. We can further visualize the documents or files by representing each file as a vector that specifies how many times each word occurs in the title or description of the file (the word frequencies). These counts are weighted to reflect the importance of each word. The weighting is the inverse of the log of the occurrence of each word across the titles or descriptions of the different files (the inverse document frequency). This vector of weighted counts is called a "bag of words" representation. Words from a specific list of "stop words" (such as function words) are not included in the representation.
Given a set of document vectors (vectors for the terms occurring in the title or description of each file), we then apply the Self Organising Maps algorithm, which helps find a partitioning of those files into clusters. The range of files in the collection can then be visualized by displaying each cluster's topic at the cluster's position on a 2-dimensional map. The algorithm searches the space of clusterings and the space of position assignments simultaneously, trying to find a global optimum for two criteria. The first criterion is that the titles or descriptions of the files within a given cluster are similar to each other. This property means that each cluster has a coherent topic. The second criterion is that clusters positioned next to each other on the map (called "neighbours") contain files whose titles or descriptions are similar to those in the neighbouring cluster. This property means that the topics of clusters change continuously as one moves across the map, making it easier for a viewer to understand the range of files in the collection than would be possible with an unstructured list of topics.
For a long phrase or sentence query, we return the files from a cluster to the user, ranked so that the file in the cluster with the highest number of semantically related relevant terms is first and the file with the lowest number is last.
Finally, we transcode the media file, or any file, from its source format into versions that will play back or render on user devices such as smartphones, tablets, and personal computers. Typically, if space is not a constraint, we store the transcoded formats of each file for rendering on various user devices so that we do not have to transcode at run time.
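The weighted bag-of-words vectors and the map-based clustering described above can be sketched in plain Python as follows. Everything specific here is an assumption made for illustration: the stop-word list, the smoothed inverse document frequency `log((1 + N) / (1 + df)) + 1` (one common form; the text does not give an exact formula), the 2x2 grid, and the learning-rate and neighbourhood schedules. The training loop is the standard online Kohonen update, a heuristic that does not guarantee the global optimum mentioned in the text.

```python
import math
import random
import re
from collections import Counter

# Illustrative stop-word list (an assumption; the source names no list).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def tokenize(text):
    """Lower-case the text and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]


def tfidf_vectors(texts):
    """Return (vocabulary, unit-length weighted count vectors)."""
    docs = [Counter(tokenize(t)) for t in texts]
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    df = {w: sum(1 for d in docs if w in d) for w in vocab}
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    vectors = []
    for d in docs:
        v = [d[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vocab, vectors


def best_matching_unit(weights, v):
    """Grid cell whose weight vector is closest to v (squared Euclidean)."""
    return min(weights, key=lambda cell: sum((a - b) ** 2 for a, b in zip(weights[cell], v)))


def train_som(vectors, rows=2, cols=2, epochs=50, seed=0):
    """Train a small self-organising map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                  # learning rate decays to zero
        sigma = max(sigma0 * (1.0 - frac), 0.3)  # neighbourhood radius shrinks
        for v in rng.sample(vectors, len(vectors)):
            br, bc = best_matching_unit(weights, v)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                # Pull the cell toward v; cells near the winner move more.
                for i in range(dim):
                    w[i] += lr * h * (v[i] - w[i])
    return weights


texts = [
    "Intro to cloud storage",
    "Cloud storage pricing guide",
    "Guitar lesson video",
    "Guitar lesson video",
]
vocab, vectors = tfidf_vectors(texts)
som = train_som(vectors)
cells = [best_matching_unit(som, v) for v in vectors]  # map position per file
```

Each entry of `cells` is the map position assigned to the corresponding file, so files that share a position form one cluster. Because the neighbourhood function also pulls cells near the winner toward each input, adjacent cells tend to end up with similar weight vectors, which is the second criterion in the text.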
Claims
1. In this invention we provide a file hosting service in the cloud in which each user is given a unique identification and password. Users can upload media content such as video, audio, static web pages or files, documents, or presentations for sharing with other users, who can play back the media content or retrieve the documents on their devices. Users upload these files along with a title, a short description, and any authors or owners, if applicable. Each uploaded file is stored in the cloud under a unique file name given by the user. After authentication, other users can view, comment on, or like a file by providing its unique file name. For each file we maintain three lists: the users who have viewed or are viewing the file; the users who have commented, along with their comments, ordered either alphabetically by user identification or chronologically, along with priority, with a filter available for the same; and the users who have liked the file.
Users can also retrieve a file or files with a search query consisting of a keyword or a long phrase or sentence, instead of providing a file name. For a keyword, the file hosting service performs a full-text search only on the title, description, and author or owner (as applicable) of all files and returns the list of relevant files ranked so that the file with the highest frequency of the keyword is first and the file with the lowest frequency is last. Similarly, for long phrases or sentences we perform semantic parsing of the phrase or sentence to get a formal representation of its meaning. In this case we form a cluster of files with similar semantic context with respect to the title and description of the files.
We form these clusters by performing Natural Language Processing on the title and description of all files: we identify important words with syntactic analysis and apply a Word Sense Disambiguation System for better file classification. We can further visualize the documents or files by representing each file as a vector that specifies how many times each word occurs in the title or description of the file (the word frequencies). These counts are weighted to reflect the importance of each word. The weighting is the inverse of the log of the occurrence of each word across the titles or descriptions of the different files (the inverse document frequency). This vector of weighted counts is called a "bag of words" representation. Words from a specific list of "stop words" (such as function words) are not included in the representation.
Given a set of document vectors (vectors for the terms occurring in the title or description of each file), we then apply the Self Organising Maps algorithm, which helps find a partitioning of those files into clusters. The range of files in the collection can then be visualized by displaying each cluster's topic at the cluster's position on a 2-dimensional map. The algorithm searches the space of clusterings and the space of position assignments simultaneously, trying to find a global optimum for two criteria. The first criterion is that the titles or descriptions of the files within a given cluster are similar to each other. This property means that each cluster has a coherent topic. The second criterion is that clusters positioned next to each other on the map (called "neighbours") contain files whose titles or descriptions are similar to those in the neighbouring cluster. This property means that the topics of clusters change continuously as one moves across the map, making it easier for a viewer to understand the range of files in the collection than would be possible with an unstructured list of topics.
For a long phrase or sentence query, we return the files from a cluster to the user, ranked so that the file in the cluster with the highest number of semantically related relevant terms is first and the file with the lowest number is last.
Finally, we transcode the media file, or any file, from its source format into versions that will play back or render on user devices such as smartphones, tablets, and personal computers. Typically, if space is not a constraint, we store the transcoded formats of each file for rendering on various user devices so that we do not have to transcode at run time. The above novel technique of providing file hosting services in the cloud is the claim for this invention.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IB2018/051301 WO2019166858A1 (en) | 2018-03-01 | 2018-03-01 | File hosting service in cloud |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/IB2018/051301 WO2019166858A1 (en) | 2018-03-01 | 2018-03-01 | File hosting service in cloud |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019166858A1 (en) | 2019-09-06 |
Family
ID=67805241
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2018/051301 WO2019166858A1 (en) | 2018-03-01 | 2018-03-01 | File hosting service in cloud |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2019166858A1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2015000083A1 (en) * | 2013-07-05 | 2015-01-08 | Anysolution, Inc. | System and method for ranking online content |
CN104778270A (en) * | 2015-04-24 | 2015-07-15 | 成都汇智远景科技有限公司 | Storage method for multiple files |
- 2018-03-01 WO PCT/IB2018/051301 (WO2019166858A1), active Application Filing
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 18908055 Country of ref document: EP Kind code of ref document: A1 |