WO2019166858A1 - File hosting service in cloud - Google Patents

File hosting service in cloud

Info

Publication number
WO2019166858A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
files
users
cluster
title
Prior art date
Application number
PCT/IB2018/051301
Other languages
French (fr)
Inventor
Pratik Sharma
Original Assignee
Pratik Sharma
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pratik Sharma
Priority to PCT/IB2018/051301
Publication of WO2019166858A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/258 Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N21/25866 Management of end-user data
    • H04N21/25875 Management of end-user data involving end-user authentication
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/27 Server based end-user applications
    • H04N21/274 Storing end-user multimedia data in response to end-user request, e.g. network recorder
    • H04N21/2743 Video hosting of uploaded data from client

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Graphics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

We provide a file hosting service in the cloud in which each user is given a unique identification and password. Users can upload media content such as video, audio, static web pages, documents or presentations to share with other users, who can play back the media or retrieve the documents on their own devices. Each file is uploaded together with its title, a short description and, where applicable, its authors or owners. After authentication, a user who provides a file's unique file name can view, comment on or like that file. Instead of providing a file name, users can also retrieve files with a search query consisting of a keyword or of a long phrase or sentence.

Description

File Hosting Service in Cloud
This invention provides a file hosting service in the cloud. Each user is given a unique identification and password, and users can upload media content such as video, audio, static web pages, documents or presentations to share with other users, who can play back the media or retrieve the documents on their own devices. Each file is uploaded together with its title, a short description and, where applicable, its authors or owners. Files are stored in the cloud under the unique file name given by the uploading user. After authentication, a user who provides that unique file name can view, comment on or like the file. For each file we maintain three lists: the users who have viewed or are viewing it; the users who have commented, together with their comments, ordered either alphabetically by user identification or chronologically by comment, along with a priority and with a filter available for the same; and the users who have liked it.
Instead of providing a file name, users can also retrieve files with a search query consisting of a keyword or of a long phrase or sentence. For a keyword, the file hosting service performs a full-text search over only the title, description and author or owner (as applicable) of every file, and returns the relevant files ranked by keyword frequency, from the file with the highest frequency to the file with the lowest. For a long phrase or sentence, we semantically parse the query to obtain a formal representation of its meaning and, for this case, form clusters of files whose titles and descriptions share a similar semantic context.
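The keyword search described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the service: the `HostedFile` record, the function names `keyword_frequency` and `keyword_search`, and the choice of whole-word, case-insensitive matching are all hypothetical, since the description does not say how occurrences are counted.

```python
import re
from dataclasses import dataclass


@dataclass
class HostedFile:
    """Hypothetical record for an uploaded file (field names are assumptions)."""
    name: str            # unique file name given by the uploading user
    title: str
    description: str
    author: str = ""     # author or owner, if applicable


def keyword_frequency(f: HostedFile, keyword: str) -> int:
    """Count whole-word, case-insensitive occurrences of the keyword
    in the only fields the search covers: title, description, author."""
    text = " ".join((f.title, f.description, f.author)).lower()
    pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
    return len(re.findall(pattern, text))


def keyword_search(files, keyword):
    """Return the files that mention the keyword, highest frequency first."""
    scored = [(keyword_frequency(f, keyword), f) for f in files]
    hits = [(count, f) for count, f in scored if count > 0]
    # sort() is stable, so files with equal counts keep their original order
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [f for _, f in hits]
```

Files that never mention the keyword are dropped rather than ranked last; that is a design choice of this sketch, made so that the result list contains only relevant files.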
We form these clusters by applying Natural Language Processing to the title and description of every file: syntactic analysis identifies the important words, and a Word Sense Disambiguation System improves file classification. To visualize the files, we represent each file by a vector that records how many times each word occurs in its title or description (the word frequencies). These counts are weighted to reflect the importance of each word: the weight is the logarithm of the inverse of the fraction of files whose title or description contains the word (the inverse document frequency). This vector of weighted counts is called a "bag of words" representation. Words from a specific list of "stop words" (such as function words) are excluded from the representation.
Given the set of document vectors (vectors of the terms occurring in the title or description of each file), we apply the Self Organising Maps algorithm, which partitions the files into clusters; the range of files in the collection can then be visualized by displaying each cluster's topic at the cluster's position on a 2-dimensional map. The algorithm searches the space of clusterings and the space of position assignments simultaneously, trying to find a global optimum for two criteria. The first criterion is that the titles or descriptions of the files within a given cluster are similar to each other, which means that each cluster has a coherent topic. The second criterion is that clusters positioned next to each other on the map (called "neighbours") contain files with similar titles or descriptions. This property means that the topics of clusters change continuously as one moves across the map, making it easier for a viewer to understand the range of files in the collection than would be possible with an unstructured list of topics.
For a long phrase or sentence query, we return the files of a cluster to the user, ranked from the file with the highest number of semantically related relevant terms to the file with the lowest.
Finally, we transcode each media file, or any other file, from its source format into versions that can be played back or rendered on user devices such as smartphones, tablets and personal computers. Typically, if storage space is not a constraint, we store the transcoded versions of each file for rendering on the various user devices, so that no transcoding is needed at run time.
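The vector representation and clustering steps can be sketched as follows. This is a small pure-Python illustration, not the invention's implementation: the stop-word list, the function names, the grid size, the learning-rate and radius schedules, and the use of log(N / df) as the inverse document frequency are all assumptions. The sketch uses the classic online Kohonen update, which is a heuristic and does not guarantee the global optimum mentioned above.

```python
import math
import random
import re
from collections import Counter

# Illustrative stop-word list (an assumption; the description names no list)
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "on", "with"}


def tokenize(text):
    """Lower-case the text, split it into words and drop stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOP_WORDS]


def tfidf_vectors(texts):
    """Return (vocabulary, vectors); each entry is count * log(N / df),
    and each vector is scaled to unit length."""
    docs = [Counter(tokenize(t)) for t in texts]
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    idf = {w: math.log(n / sum(1 for d in docs if w in d)) for w in vocab}
    vectors = []
    for d in docs:
        v = [d[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0  # avoid dividing by 0
        vectors.append([x / norm for x in v])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_unit(weights, v):
    """Grid position (row, col) of the map unit closest to vector v."""
    return min(weights, key=lambda pos: sq_dist(weights[pos], v))


def train_som(vectors, rows=2, cols=2, epochs=200, lr0=0.5, radius0=1.0, seed=0):
    """Train a rows x cols self-organising map on the document vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs           # decays from 1 towards 0
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.3)     # keep the neighbourhood non-zero
        for v in rng.sample(vectors, len(vectors)):   # shuffled pass
            br, bc = best_unit(weights, v)
            for (r, c), w in weights.items():
                # Units near the winner on the grid are pulled towards v too,
                # which is what makes neighbouring clusters similar.
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += lr * h * (v[i] - w[i])
    return weights


def cluster_files(weights, vectors):
    """Map each grid position to the indices of the files assigned to it."""
    clusters = {}
    for idx, v in enumerate(vectors):
        clusters.setdefault(best_unit(weights, v), []).append(idx)
    return clusters
```

Calling `tfidf_vectors` on the concatenated title and description of each file, then `train_som` and `cluster_files`, yields one list of file indices per occupied map position; files with identical text always fall in the same cluster.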

Claims

File Hosting Service in Cloud
The following is the claim for this invention:
1. This invention provides a file hosting service in the cloud. Each user is given a unique identification and password, and users can upload media content such as video, audio, static web pages, documents or presentations to share with other users, who can play back the media or retrieve the documents on their own devices. Each file is uploaded together with its title, a short description and, where applicable, its authors or owners. Files are stored in the cloud under the unique file name given by the uploading user. After authentication, a user who provides that unique file name can view, comment on or like the file. For each file we maintain three lists: the users who have viewed or are viewing it; the users who have commented, together with their comments, ordered either alphabetically by user identification or chronologically by comment, along with a priority and with a filter available for the same; and the users who have liked it.
Instead of providing a file name, users can also retrieve files with a search query consisting of a keyword or of a long phrase or sentence. For a keyword, the file hosting service performs a full-text search over only the title, description and author or owner (as applicable) of every file, and returns the relevant files ranked by keyword frequency, from the file with the highest frequency to the file with the lowest. For a long phrase or sentence, we semantically parse the query to obtain a formal representation of its meaning and, for this case, form clusters of files whose titles and descriptions share a similar semantic context.
We form these clusters by applying Natural Language Processing to the title and description of every file: syntactic analysis identifies the important words, and a Word Sense Disambiguation System improves file classification. To visualize the files, we represent each file by a vector that records how many times each word occurs in its title or description (the word frequencies). These counts are weighted to reflect the importance of each word: the weight is the logarithm of the inverse of the fraction of files whose title or description contains the word (the inverse document frequency). This vector of weighted counts is called a "bag of words" representation. Words from a specific list of "stop words" (such as function words) are excluded from the representation.
Given the set of document vectors (vectors of the terms occurring in the title or description of each file), we apply the Self Organising Maps algorithm, which partitions the files into clusters; the range of files in the collection can then be visualized by displaying each cluster's topic at the cluster's position on a 2-dimensional map. The algorithm searches the space of clusterings and the space of position assignments simultaneously, trying to find a global optimum for two criteria. The first criterion is that the titles or descriptions of the files within a given cluster are similar to each other, which means that each cluster has a coherent topic. The second criterion is that clusters positioned next to each other on the map (called "neighbours") contain files with similar titles or descriptions. This property means that the topics of clusters change continuously as one moves across the map, making it easier for a viewer to understand the range of files in the collection than would be possible with an unstructured list of topics.
For a long phrase or sentence query, we return the files of a cluster to the user, ranked from the file with the highest number of semantically related relevant terms to the file with the lowest.
Finally, we transcode each media file, or any other file, from its source format into versions that can be played back or rendered on user devices such as smartphones, tablets and personal computers. Typically, if storage space is not a constraint, we store the transcoded versions of each file for rendering on the various user devices, so that no transcoding is needed at run time. The above novel technique of providing file hosting services in the cloud is the claim for this invention.

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/IB2018/051301 WO2019166858A1 (en) 2018-03-01 2018-03-01 File hosting service in cloud

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/IB2018/051301 WO2019166858A1 (en) 2018-03-01 2018-03-01 File hosting service in cloud

Publications (1)

Publication Number Publication Date
WO2019166858A1 (en) 2019-09-06

Family

ID=67805241

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2018/051301 WO2019166858A1 (en) 2018-03-01 2018-03-01 File hosting service in cloud

Country Status (1)

Country Link
WO (1) WO2019166858A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015000083A1 (en) * 2013-07-05 2015-01-08 Anysolution, Inc. System and method for ranking online content
CN104778270A (en) * 2015-04-24 2015-07-15 成都汇智远景科技有限公司 Storage method for multiple files

Similar Documents

Publication Publication Date Title
US20220044139A1 (en) Search system and corresponding method
Cantador et al. Second workshop on information heterogeneity and fusion in recommender systems (HetRec2011)
Ding et al. The utility of linguistic rules in opinion mining
Noruzi Folksonomies: Why do we need controlled vocabulary?
Bergamaschi et al. Comparing LDA and LSA topic models for content-based movie recommendation systems
Le et al. Unsupervised keyphrase extraction: Introducing new kinds of words to keyphrases
Arguello Aggregated search
US9727617B1 (en) Systems and methods for searching quotes of entities using a database
US20170011643A1 (en) Ranking of segments of learning materials
Fernandez et al. Linking data across universities: an integrated video lectures dataset
Hu et al. Enabling semantic search and knowledge discovery for ArcGIS Online: A linked-data-driven approach
Abdelkader et al. Brands in newsstand: Spatio-temporal browsing of business news
Sateli et al. Semantic user profiles: Learning scholars’ competences by analyzing their publications
KR101478016B1 (en) Apparatus and method for information retrieval based on sentence cluster using term co-occurrence
Bi et al. Iterative relevance feedback for answer passage retrieval with passage-level semantic match
Liu et al. Event-based cross media question answering
Bolettieri et al. Automatic metadata extraction and indexing for reusing e-learning multimedia objects
Juric et al. Discovering links between political debates and media
AlNoamany Using web archives to enrich the live web experience through storytelling
Messina et al. Hyper Media News: a fully automated platform for large scale analysis, production and distribution of multimodal news content
Kurihara et al. Target-topic aware Doc2Vec for short sentence retrieval from user generated content
Mekthanavanh et al. Social web video clustering based on multi-modal and clustering ensemble
Strobel et al. Metadata for scientific audiovisual media: current practices and perspectives of the TIB| AV-Portal
WO2019166858A1 (en) File hosting service in cloud
Pak et al. Normalization of term weighting scheme for sentiment analysis

Legal Events

Date Code Title Description
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18908055

Country of ref document: EP

Kind code of ref document: A1