WO2020250110A1 - A system and a method for generating a 3D visualization using mixed reality-based HMD - Google Patents

A system and a method for generating a 3D visualization using mixed reality-based HMD

Info

Publication number
WO2020250110A1
Authority
WO
WIPO (PCT)
Prior art keywords
sketches
models
information
pages
hmd
Prior art date
Application number
PCT/IB2020/055370
Other languages
French (fr)
Inventor
Pankaj Raut
Abhijit Patil
Abhishek Tomar
Original Assignee
Pankaj Raut
Abhijit Patil
Abhishek Tomar
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pankaj Raut, Abhijit Patil, Abhishek Tomar
Publication of WO2020250110A1 publication Critical patent/WO2020250110A1/en

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/016Input arrangements with force or tactile feedback as computer generated output to the user
    • GPHYSICS
    • G02OPTICS
    • G02BOPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01Head-up displays
    • G02B27/017Head mounted
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/033Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor
    • G06F3/0346Pointing devices displaced or positioned by the user, e.g. mice, trackballs, pens or joysticks; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/048Interaction techniques based on graphical user interfaces [GUI]
    • G06F3/0487Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F3/0488Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data

Definitions

  • Embodiments of the present invention generally relate to an artificial intelligence powered mixed reality system and more particularly to a computer system, method and system for generating a 3D visualization from information such as text, 2D sketches, 3D sketches and voice input, using a Mixed Reality (MR) based Head Mounted Device (HMD).
  • Illustrations and visualisations have always been considered one of the most effective forms of communication. They convey the most complex information using minimal space and time. Ideas are most conveniently expressed and interpreted through visualisation. For example, children learning about the water cycle understand the concept easily when it is explained with the help of a diagram, rather than through all the text about the processes mentioned in books. Additionally, it is a proven fact that information studied with the help of illustrations/graphics holds the reader's attention more easily and is better understood and remembered. Similarly, it is often easier to visualise ideas than to write them down in words. Humans all over the world have been using such methods to convey their ideas, but only a select section of individuals, such as painters, sketch artists and animation professionals, are capable of presenting such informative visualisations.
  • An object of the present invention is to provide a computer system, method and system for generating a 3D visualization from information such as text, 2D sketches, 3D sketches and voice input, using a Mixed Reality (MR) based Head Mounted Device (HMD).
  • Another object of the present invention is to convert written information and 2D sketches into 3D models and visualisations.
  • Yet another object of the present invention is to provide a handheld electronic controller to create and interact with the generated 3D sketches, 3D models and 3D visualisations in the mixed reality space.
  • Yet another object of the present invention is to enable users to draw 3D sketches in the mixed reality space using hand gestures or the electronic controller and then convert the drawn sketches to 3D models and visualisations.
  • Yet another object of the present invention is to enable users to add, remove and design the 3D models using the electronic controller as any design tool, such as a paint brush, hammer, chisel or axe, while providing spatial sound effects and dynamic haptic feedback depending upon the tool selected and the virtual material being worked upon.
  • the method comprises the steps of receiving one or more images of one or more pages of a document in a chronological order, one by one, using an image acquisition device of the HMD; processing the one or more images for text recognition and extracting information from the recognised text in a form of a plurality of excerpts; identifying a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset; converting the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models using a second prestored dataset; processing and merging the plurality of 3-Dimensional (3D) models to generate a 3D visualisation/animation depicting the information on the one or more pages; and displaying the generated 3D visualisation, having a predetermined duration, in the chronological order of the one or more pages, in a mixed reality space of the HMD.
  • the method further comprises sketching and adding a new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user in the mixed reality space, using a handheld electronic controller connected with the HMD; and editing and modelling the previously generated plurality of 2D sketches and/or 3D sketches and/or the plurality of 3D models in the mixed reality space, using the handheld electronic controller.
  • the document is selected from a group comprising e-books, printed textbooks, handwritten texts, info-graphic books, user manuals, novels, e-files, manuscripts or inscriptions.
  • extracting information from the recognised text includes extracting terms indicative of a corresponding plurality of objects and segmenting the recognised text into the plurality of excerpts based on the context in which the terms are being used.
  • the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
  • the plurality of objects include all the living and non-living objects selected from a group comprising humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
  • the second prestored dataset includes a plurality of 3D models corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
  • processing and merging the plurality of 3-Dimensional (3D) models to generate a 3D visualisation comprises a step of adding organic gestures and motion to the generated plurality of 3D models using prestored gestures and motion profiles based on the generated plurality of 3D models and the information extracted from the page.
  • a computer system for generating a 3D visualization using information from a document in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD).
  • the computer system comprises a memory unit configured to store machine-readable instructions; and a processor operably connected with the memory unit.
  • the processor obtains the machine-readable instructions from the memory unit, and is configured by the machine-readable instructions to receive one or more images of one or more pages of a document in a chronological order, one by one, using an image acquisition device of the HMD; process the one or more images for text recognition and extracting information from the recognised text in a form of a plurality of excerpts; identify a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset; convert the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models using a second prestored dataset; process and merge the plurality of 3-Dimensional (3D) models to generate a 3D visualisation depicting the information on the one or more pages; and display the generated 3D visualisation, having a predetermined duration, in the chronological order of the one or more pages, in a mixed reality space of the HMD, thereby enabling a user to understand the document even without knowing the language of the document.
  • the system further comprises a handheld electronic controller connected with the HMD.
  • the handheld electronic controller is configured to sketch and add a new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user, in the mixed reality space; and edit and model the previously generated plurality of 2D sketches and/or 3D sketches and/or the plurality of 3D models in the mixed reality space.
  • the document is selected from a group comprising e-books, printed textbooks, handwritten texts, info-graphic books, user manuals, novels, e-files, manuscripts or inscriptions.
  • the processor is configured to extract information from the recognised text by extracting terms indicative of a corresponding plurality of objects and segmenting the recognised text into the plurality of excerpts based on the context in which the terms are being used.
  • the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
  • the plurality of objects include all the living and non-living objects selected from a group comprising humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
  • the second prestored dataset includes a plurality of 3D models corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
  • the processor is configured to process and merge the plurality of 3-Dimensional (3D) models to form the 3D visualisation/animation by adding organic gestures and motion to the generated plurality of 3D models using prestored gestures and motion profiles based on the generated plurality of 3D models and the information extracted from the page.
  • the system comprises a control module connected with the HMD, and an interface module.
  • the interface module is configured to receive one or more images of one or more pages of a document in a chronological order, one by one using an image acquisition device of the HMD.
  • control module is configured to process the one or more images for text recognition and extracting information from the recognised text in a form of a plurality of excerpts; identify a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset; and convert the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models using a second prestored dataset. Additionally, the control module is configured to process and merge the plurality of 3-Dimensional (3D) models to generate a 3D visualisation depicting the information on the one or more pages.
  • the interface module is further configured to display the generated 3D visualisation, having a predetermined duration, in the chronological order of the one or more pages, in a mixed reality space of the HMD.
  • the 3D visualisations are generated page by page, starting from a first page of the one or more pages in a chronological order and the 3D visualisation of a subsequent page continues from a chronology of information from a previous page. This enables the user to understand the document even without knowing the language of the document.
  • the system further comprises the handheld electronic controller connected with the HMD.
  • the handheld electronic controller is configured to sketch and add a new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user in the mixed reality space; and edit and model the previously generated plurality of 2D sketches and/or 3D sketches and/or the plurality of 3D models in the mixed reality space.
  • the document is selected from a group comprising, but not limited to, e-books, printed textbooks, handwritten texts, info-graphic books, user manuals, novels, e-files, manuscripts or inscriptions.
  • control module is configured to extract information from the recognised text by extracting terms indicative of a corresponding plurality of objects and segmenting the recognised text into the plurality of excerpts based on the context in which the terms are being used.
  • the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
  • the plurality of objects include all the living and non-living objects selected from a group comprising, but not limited to, humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
  • the second prestored dataset includes a plurality of 3D models corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
  • control module is configured to process and merge the plurality of 3-Dimensional (3D) models by adding organic gestures and motion to the generated plurality of 3D models using prestored gestures and motion profiles based on the generated plurality of 3D models and the information extracted from the page.
  • a handheld electronic controller connected with a Mixed Reality (MR) based Head Mounted Device comprises at least a magnetic coil, a tactile sensor, a haptic feedback device, a biometric sensor, a microprocessor and a power source. Further, the handheld electronic controller is configured to sketch and add a new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user in the mixed reality space; and edit and model the previously generated plurality of 2D sketches and/or the 3D sketches and/or the plurality of 3D models in the mixed reality space.
  • the handheld electronic controller is configured to be used as a tool in one or more modes based on manual selection and/or a manner of holding the handheld electronic controller.
  • the tool is selected from selection tools, sketching tools such as a pencil, pen and paint brush, and hardware tools such as a hammer, chisel, screwdriver, plier, axe and wrench.
  • the handheld electronic controller is further configured to provide a dynamic haptic feedback using the haptic feedback sensors and spatial sound effect according to a selected virtual tool, a pressure applied by the user and a material being modelled, while interacting in the mixed reality space.
  • the tactile sensor senses the force exerted by the user to simulate click and drag actions.
  • the one or more modes are selected from a sketching mode, painting mode, sculpting mode and modelling mode, each mode comprising a customised tool palette showing the different tools required for the respective mode.
  • the biometric sensor is configured to identify and save profiles for multiple users so as to change the one or more modes automatically as preferred by the user.
  • sketching and adding a new plurality of 2D sketches and/or 3D sketches further includes copying 2D or 3D images from visible real-world sources, such as a physical magazine or an electronic file, by creating a boundary to enclose a required 2D or 3D image using the electronic controller; and pasting the selected 2D or 3D image in the mixed reality space of the HMD and converting the 2D or 3D image into a 3D model.
  • FIG. 1 illustrates an exemplary environment of computing devices in which the various embodiments described herein may be implemented, in accordance with an embodiment of the present invention.
  • FIG. 2 illustrates a method for generating a 3D visualization using information from a document in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD), in accordance with an embodiment of the present invention.
  • FIGS. 3A-3B illustrate an information flow diagram of an implementation of the computer system of Fig. 1 and the method of Fig. 2, in accordance with an embodiment of the present invention.
  • FIGS. 3C-3H illustrate exemplary implementations of the computer system of Fig. 1 and the method of Fig. 2, in accordance with other embodiments of the present invention.
  • FIG. 4 illustrates a system for generating a 3D visualization using information from a document in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD), in accordance with an embodiment of the present invention.
  • compositions or an element or a group of elements are preceded with the transitional phrase "comprising", it is understood that we also contemplate the same composition, element or group of elements with the transitional phrases "consisting of", "consisting", "selected from the group consisting of", "including", or "is" preceding the recitation of the composition, element or group of elements, and vice versa.
  • the environment (100) comprises a Mixed Reality (MR) based Head Mounted Device (HMD) (102).
  • the HMD (102) may be envisaged to include capabilities of generating an augmented reality (AR) environment, a mixed reality (MR) environment and an extended reality (XR) environment in a single device that lets a user interact with digital content within the environment generated in the HMD (102).
  • the HMD (102) is envisaged to be worn by a user and therefore, may be provided with, but not limited to, one or more bands, straps and locks for mounting on the head; or may even be provided as smart glasses (with temples) to be worn just like spectacles.
  • the HMD (102) is envisaged to include components such as, but not limited to, an optical unit having one or more lenses, one or more reflective mirrors & a display unit; a sensing unit having one or more sensors & an image acquisition device; an audio unit comprising one or more speakers and one or more microphones; a user interface; a wireless communication module; and one or more ports.
  • the optical unit is envisaged to provide a high resolution of 2K per eye and a wider field of view.
  • the display unit comprises a Liquid Crystal on Silicon (LCoS) display and a visor.
  • the one or more sensors may be selected from, but not limited to, an RGB sensor, a depth sensor, an eye tracking sensor, an ambient light sensor, an accelerometer, a gyroscope, an Inertial Measurement Unit (IMU) sensor (a combination of an accelerometer, a gyroscope and a magnetometer) and an EM emitter sensor.
  • the EM emitter sensor can be placed on top of the HMD or separately like a centralized pose tracking hub which can track any EM receiver.
  • the pose estimation or 6DOF tracking of the HMD is done using visual and inertial information from the RGB sensor and the IMU sensor.
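  • As a minimal illustrative sketch (not the disclosed implementation), the snippet below shows a one-axis complementary filter of the kind commonly used to fuse fast-but-drifting gyroscope integration with slower, drift-free visual orientation estimates; the 1D state, the gain value and all names are assumptions:

```python
# Toy one-axis sketch of visual-inertial fusion for HMD pose tracking.
# Real 6DOF tracking fuses full orientation and position (e.g. with an EKF);
# this complementary filter only illustrates the principle. The gain value
# and function names are assumptions, not taken from the disclosure.
def fuse_yaw(prev_yaw_deg: float, gyro_rate_dps: float, dt_s: float,
             visual_yaw_deg: float, alpha: float = 0.98) -> float:
    integrated = prev_yaw_deg + gyro_rate_dps * dt_s          # IMU: fast, but drifts
    return alpha * integrated + (1.0 - alpha) * visual_yaw_deg  # vision: slow, drift-free
```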
  • the eye tracking sensors may include one or more ultra-compact sensor cubes configured for focal adjustment of the optics, Point of View (POV) rendering with computational optimisation and the Point of Interest (POI) information capture by tracking the retina of the eye.
  • the image acquisition device is selected from one or more of, but not limited to, omnidirectional cameras, wide-angle stereo vision cameras, RGB-D cameras, digital cameras, thermal cameras, infrared cameras and night vision cameras.
  • the one or more microphones in the audio unit are configured to capture binaural audio along with the motion of the user, and 3D stereo sound with acoustic source localization with the help of the IMU.
  • the audio unit may also implement background noise cancellation techniques to further enhance the experience.
  • the one or more speakers may have an audio projection mechanism that projects sound directly to the concha of an ear of the user and reaches an ear canal after multiple reflections.
  • the one or more ports are configured to enable wired connection between one or more external devices and the HMD (102).
  • the one or more ports may include, but are not limited to, micro-USB ports, USB Type-C ports and HDMI ports.
  • the wireless communication module is configured to establish a wireless communication network (108) to enable wireless communication between the one or more external devices and the HMD (102).
  • communication module may include one or more of, but is not limited to, a WiFi module, a Bluetooth module, an NFC module or a GSM/GPRS module. Therefore, the wireless communication network (108) may be, but is not limited to, a wireless intranet network, Bluetooth, NFC, WiFi internet or a GSM/GPRS based 4G LTE or 5G communication network.
  • the environment (100) comprises the computer system (104) connected with the HMD (102).
  • the computer system (104) may be encased inside the HMD (102) itself.
  • the computer system (104) comprises a memory unit (1042) configured to store machine-readable instructions.
  • the machine-readable instructions may be loaded into the memory unit (1042) from a non-transitory machine-readable medium, such as, but not limited to, CD-ROMs, DVD-ROMs and Flash Drives. Alternately, the machine-readable instructions may be loaded in a form of a computer software program into the memory unit (1042).
  • the memory unit (1042) in that manner may be selected from a group comprising EPROM, EEPROM and Flash memory.
  • the computer system (104) includes a processor (1044) operably connected with the memory unit (1042).
  • the processor (1044) may be a multipurpose, clock driven, register based, digital integrated circuit that accepts binary data as input, processes it according to instructions stored in its memory and provides results as output.
  • the processor (1044) may be, but not limited to, a microprocessor.
  • the microprocessor may contain both combinational logic and sequential digital logic.
  • the computer system (104) may further implement artificial intelligence and machine learning based technologies for, but not limited to, data analysis, collating data and presentation of data in real-time.
  • a data repository (not shown) is also connected with the computer system (104).
  • the data repository may be, but not limited to, a local or a cloud-based storage, configured to store a first prestored dataset, second prestored dataset and gestures & motion profiles.
  • the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
  • the plurality of objects include all the living and non-living objects selected from a group comprising humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
  • the second prestored dataset includes a plurality of 3D models corresponding to a respective plurality of 2D sketches, the extracted information and the 3D sketches of the above-mentioned plurality of objects.
  • the gestures & motion profiles include various common gestures and motions of the above-mentioned plurality of objects.
  • it may include gestures and motions of a person (both male/female) performing day-to-day activities like walking, running, reading and eating; complex activities like dancing and combat animations; the flowing of rivers; the movements of animals; and the working of vehicles, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
  • the data repository also stores spatial sounds associated with the plurality of objects. For example: sounds of various tools such as hammer blows, pencil strokes, flowing water bodies, sounds of vehicles, sounds of animals etc. may be prestored in the data repository, which may be provided to the computer system (104) when queried with appropriate protocols.
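  • A hedged sketch of the repository layout implied above, bundling each object's 2D sketch, 3D model, motion profiles and spatial sounds into one record; the schema, keys and file names are assumptions for illustration only:

```python
# Hypothetical per-object record layout for the data repository described
# above; the keys, file names and query protocol are illustrative assumptions.
REPOSITORY = {
    "hammer": {
        "sketch_2d": "hammer.png",      # first prestored dataset entry
        "model_3d": "hammer.glb",       # second prestored dataset entry
        "motions": ["swing", "tap"],    # gestures & motion profiles
        "sounds": ["hammer_blow.wav"],  # spatial sounds
    },
}

def query(obj: str, asset: str):
    """Return one asset class for an object, or None if absent."""
    return REPOSITORY.get(obj, {}).get(asset)
```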
  • the HMD (102) is envisaged to include a battery unit.
  • the battery unit is configured to power the components of the HMD (102) and the computer system (104).
  • the battery unit may be, but not limited to, a Li-ion detachable and rechargeable battery. The battery may be detached, recharged and then re-attached.
  • the environment further comprises a handheld electronic controller (106) connected with the HMD (102) and the computer system (104) via the communication network (108).
  • the electronic handheld controller may also be connected via one or more ports of the HMD (102).
  • the handheld electronic controller (106) comprises a magnetic coil, a microprocessor, a tactile sensor, a haptic feedback mechanism, a biometric sensor and a power source.
  • the magnetic coil acts as receiver for the magnetic field emitted by the electromagnetic emitter sensors of the HMD (102).
  • the microprocessor may be a multipurpose, clock driven, register based, digital integrated circuit that accepts binary data as input, processes it according to instructions stored in its memory and provides results as output.
  • the microprocessor is used for receiving and processing the magnetic field, thereby enabling movement tracking of the handheld electronic controller (106).
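  • The disclosure does not detail the tracking mathematics; purely as a toy illustration, a magnetic dipole's field magnitude falls off roughly as 1/r³, so a measured field strength yields a coarse range estimate (full 6DOF electromagnetic tracking recovers pose from multi-axis coil measurements). The constant below is a placeholder assumption:

```python
# Toy range estimate from a dipole field: |B| ~ k / r^3  =>  r = (k / |B|)^(1/3).
# k depends on the emitter's dipole moment; the value here is a placeholder.
def estimate_range_m(field_magnitude_t: float, k_dipole: float = 1e-7) -> float:
    return (k_dipole / field_magnitude_t) ** (1.0 / 3.0)
```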
  • the power source may be, but not limited to, a Li-ion detachable and rechargeable battery configured to power the magnetic coil and the microprocessor.
  • the handheld electronic controller (106) may be enclosed in a housing made of, but not limited to, plastic or metal and shaped as, but not limited to, gamepad, joystick, wand or any customised shape.
  • the handheld electronic controller (106) has a custom shape having a thicker upper portion with one or more pointed ends and thinner lower portion that is to be held by the user.
  • the custom shape is such that it may be held like multiple tools, such as a pen/pencil or paint brush, or like hardware tools such as a hammer, chisel or axe.
  • the organic and ergonomic design of the controller resembles the stone tools of early humans, which were used like a Swiss knife for multiple tasks.
  • the microprocessor identifies the tool as which the user wants to use the handheld electronic controller (106), based on the way it is held by the user.
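  • One plausible, purely illustrative way to infer the intended tool from how the controller is held is nearest-template matching over simple grip features; the feature set, template values and normalisation below are invented for the sketch:

```python
# Hypothetical grip templates: (grip span in metres, tilt in degrees,
# number of tactile contact points). All values are invented for illustration.
GRIP_TEMPLATES = {
    "pencil": (0.02, 40.0, 3),
    "hammer": (0.09, 90.0, 5),
    "chisel": (0.03, 10.0, 4),
}

def identify_tool(span_m: float, tilt_deg: float, contacts: int) -> str:
    """Return the template whose grip signature is closest to the reading."""
    def distance(tool: str) -> float:
        s, a, c = GRIP_TEMPLATES[tool]
        return (((span_m - s) / 0.1) ** 2
                + ((tilt_deg - a) / 90.0) ** 2
                + (contacts - c) ** 2)
    return min(GRIP_TEMPLATES, key=distance)
```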
  • the biometric sensor enables the electronic controller to identify and save profiles for different users so that it can change the mode automatically as preferred by the user.
  • the tactile sensor senses the force exerted by the user to simulate click and drag actions.
  • the haptic feedback mechanism may comprise one or more motors having unbalanced weights to create vibrations; or one or more sliding plates in the surface of the thinner lower portion to be held, which mimic the shear and friction forces one would feel if they were interacting with real objects.
  • the haptic feedback mechanism provides the feel of using a tool based on the tool being used and the material being worked upon, thereby giving the sensation of friction and material-tool interaction.
  • spatial sound effects of the tool being used are also provided via the audio unit of the HMD (102). For example, a user would actually feel the impact on the hand and hear a banging sound in the ears if he/she is using the handheld electronic controller (106) as a hammer to shape a metallic object in the mixed reality space.
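  • A minimal sketch, assuming an invented material-stiffness table and scaling, of how the selected tool, the virtual material and the applied pressure could be mapped to the haptic and spatial-sound command described in the two paragraphs above:

```python
# Illustrative mapping from (tool, material, pressure) to haptic and sound
# parameters; the stiffness table, tool classes and scaling are assumptions.
MATERIAL_STIFFNESS = {"wood": 0.6, "metal": 1.0, "clay": 0.2}
IMPACT_TOOLS = {"hammer", "chisel", "axe"}

def haptic_command(tool: str, material: str, pressure_n: float) -> dict:
    stiffness = MATERIAL_STIFFNESS.get(material, 0.5)
    impact = tool in IMPACT_TOOLS  # impact tools pulse; others rub/drag
    return {
        "pattern": "impulse" if impact else "friction",
        "amplitude": min(1.0, stiffness * pressure_n / 10.0),  # 0..1 motor drive
        "sound": f"{tool}_{material}.wav",  # spatial sound cue played by the HMD
    }
```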
  • the handheld electronic controller (106) may also comprise an interface.
  • the interface may include one or more of, but not limited to, physical buttons, trackpad, joystick and touch-based buttons on the surface of the housing to enable a user to interact with objects in the mixed reality space.
  • Figure 2 illustrates a method (200) for generating a 3D visualization using information from a document in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD (102)), in accordance with an embodiment of the present invention.
  • the 3D visualisation may be understood as an animation of virtual 3D objects moving in mixed reality space.
  • the document is selected from a group comprising, but not limited to, e-books, printed textbooks, handwritten texts, info-graphic books, user manuals, novels, e-files, manuscripts and inscriptions. Therefore, the term "document" is envisaged to cover not only its literal meaning but also books/magazines/manuals as well as any written, drawn, presented, or memorialized representation of thought.
  • the method (200) starts at step 210 by receiving one or more images of one or more pages of a document in a chronological order, one by one using the image acquisition device of the HMD (102).
  • the image acquisition device can be understood as various types of cameras provided on the HMD (102) as previously mentioned in specification.
  • the one or more images may be captured by the image acquisition device and then received by the processor (1044).
  • the document (302) is envisaged to be a story book, that has two pages (3022) open at the same time. So, the processor (1044) receives two images of the pages (3022) in the chronological order.
  • the one or more images are processed for text recognition and extracting information from the recognised text in a form of a plurality of excerpts.
  • Various techniques may be used for text recognition and information extraction, such as, but not limited to, Optical Character Recognition (OCR), polygon-based character recognition and any Artificial Intelligence (AI) based techniques.
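  • Purely as an illustration of this step (the disclosure names OCR but no particular library), a minimal sketch using the open-source pytesseract wrapper; the greyscale preprocessing and function name are assumptions:

```python
# Minimal OCR sketch for the text-recognition step, assuming the open-source
# pytesseract wrapper around Tesseract; not the disclosed implementation.
from PIL import Image
import pytesseract

def recognise_page_text(image_path: str) -> str:
    """OCR one captured page image and return the recognised text."""
    page = Image.open(image_path).convert("L")  # greyscale often improves OCR
    return pytesseract.image_to_string(page)
```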
  • the information extracted may be divided in the form of a plurality of excerpts and stored as a machine-interpretable array.
  • the "excerpts" may be understood as one paragraph or a group of multiple paragraphs.
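  • An assumption-laden sketch of segmenting recognised text into excerpts stored as a machine-interpretable array; a real system would use NLP-based segmentation, whereas the blank-line split and the small object vocabulary here are invented for illustration:

```python
# Illustrative segmentation of recognised text into excerpts. The blank-line
# paragraph split and the tiny object vocabulary are assumptions; a real
# system would use context-aware NLP/NER as described in the text.
OBJECT_TERMS = {"man", "river", "net", "fish", "boat"}  # hypothetical vocabulary

def extract_excerpts(recognised_text: str) -> list[dict]:
    excerpts = []
    paragraphs = [p for p in recognised_text.split("\n\n") if p.strip()]
    for i, para in enumerate(paragraphs):
        terms = {w.strip(".,;!?").lower() for w in para.split()} & OBJECT_TERMS
        excerpts.append({"index": i, "text": para, "objects": sorted(terms)})
    return excerpts
```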
  • the processor (1044) identifies a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset. It is a common practice around the world that most books and informative publications use 2D images/sketches along with the text to convey information.
  • the processor (1044) herein identifies the 2D sketches and also the excerpt with which each is associated, based on the extracted information.
  • the 2D sketches may be identified from the text by comparing the captured one or more images with the first prestored dataset.
  • the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
  • the plurality of objects include all the living and non-living objects selected from a group comprising humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
  • the story book contains a sketch of a man fishing on the first page.
  • the processor (1044) compares the captured 2D image with the plurality of objects to identify the object in the image/sketch as a man, river, water body, net, fishing gear etc. Additionally, the processor (1044) recognises that this image is associated with the second excerpt that talked about the fishing trip of the fisherman. In this manner, the plurality of 2D sketches are successfully linked with the text information.
  • the plurality of 2D sketches may be generated by the processor (1044) by taking cues (such as fishing trip, man, river, water body, net, fishing gear etc.) from the text information only and comparing them with the plurality of objects in the first prestored dataset. Accordingly, when the objects are identified, the corresponding 2D sketch is generated. In that case, techniques such as visual question answering may be used by the processor (1044).
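  • A hedged sketch of linking each excerpt to 2D sketches via the first prestored dataset, modelled here as a simple mapping from object label to sketch asset; the dataset layout and file names are assumptions:

```python
# Illustrative "first prestored dataset": object label -> 2D sketch asset.
# Layout and file names are invented for the sketch.
FIRST_DATASET = {"man": "man_sketch.png", "river": "river_sketch.png",
                 "net": "net_sketch.png"}

def identify_2d_sketches(excerpts: list[dict]) -> list[dict]:
    """Attach the matching 2D sketches to each excerpt's object list."""
    for excerpt in excerpts:
        excerpt["sketches_2d"] = [FIRST_DATASET[o] for o in excerpt["objects"]
                                  if o in FIRST_DATASET]
    return excerpts
```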
  • the processor (1044) converts the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models (314) using a second prestored dataset.
  • the second prestored dataset includes a plurality of 3D models (314) corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
  • the plurality of 2D sketches that are identified or generated in the previous step are compared with prestored 3D models of the corresponding 2D sketches and accordingly the corresponding plurality of 3D models (314) are obtained. The same has been illustrated in figure 3B.
  • the plurality of 3D models (314) may be generated by the processor (1044) by taking cues (such as fishing trip, man, river, water body, net, fishing gear etc.) from the text information only and comparing them with the 3D models corresponding to the plurality of objects in the second prestored dataset. Accordingly, when the plurality of objects are identified, the corresponding plurality of 3D models (314) is generated. In that case, techniques such as visual question answering may be used by the processor (1044).
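  • Continuing the same illustrative data model, the 2D-to-3D conversion step can be sketched as a lookup in the second prestored dataset keyed by object labels; the structure and file names remain assumptions:

```python
# Illustrative "second prestored dataset": object label -> 3D model asset.
SECOND_DATASET = {"man": "man.glb", "river": "river.glb", "net": "net.glb"}

def convert_to_3d_models(excerpt: dict) -> list[str]:
    """Return the prestored 3D model assets for one excerpt's objects."""
    return [SECOND_DATASET[o] for o in excerpt["objects"] if o in SECOND_DATASET]
```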
  • the plurality of 3-Dimensional (3D) models are processed and merged by the processor (1044) to generate a 3D visualization (316) depicting the information on the one or more pages.
  • the processor (1044) adds organic gestures and motion to the generated plurality of 3D models (314) using the prestored gestures and motion profiles in the data repository. As previously mentioned, the prestored gestures and motion profiles are based on the generated plurality of 3D models (314). Further, the processor (1044) also takes visual cues from the information extracted.
  • the processor (1044) may apply motions and gestures performed during fishing activity to the generated plurality of 3D models (314) to generate the 3D visualization (316) or the moving animation.
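  • A sketch of the merge step under the same assumptions: each retrieved 3D model is paired with a prestored motion profile suggested by activity words from the excerpt and placed on a shared timeline; the dataclass and tables are invented for illustration:

```python
# Illustrative merge of 3D models into one animated scene for a page.
from dataclasses import dataclass

@dataclass
class MotionClip:
    model: str        # 3D model asset to animate
    profile: str      # prestored gesture/motion profile
    start_s: float    # start time on the page's timeline
    duration_s: float

# Hypothetical mapping from activity cues in the excerpt to motion profiles.
ACTIVITY_TO_PROFILE = {"fishing": "cast_net_loop", "walking": "walk_cycle"}

def build_visualisation(models: list[str], activities: list[str],
                        duration_s: float = 5.0) -> list[MotionClip]:
    """Simplification: one scene-level profile for the page; a real system
    would assign per-model motions from the gestures & motion profiles."""
    profile = next((ACTIVITY_TO_PROFILE[a] for a in activities
                    if a in ACTIVITY_TO_PROFILE), "idle")
    return [MotionClip(m, profile, 0.0, duration_s) for m in models]
```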
  • the 3D visualization (316) may have a predetermined frame rate and a duration.
  • the processor (1044) displays the generated 3D visualization (316), having the predetermined framerate and duration, in the chronological order of the one or more pages, in a mixed reality space (312) of the HMD (102). The same has been illustrated in figure 3B, which shows the 3D visualization (316) of a man fishing.
  • the 3D visualization shown in figure 3B is just an example, and there may be more than one 3D visualization (316) for a single page, depending upon the number of excerpts into which the information has been extracted.
  • the plurality of excerpts are stored as the machine-interpretable array, so each new 3D visualization continues from the previous 3D visualization (316), like a story, movie or chapter in a continuous flow of information.
  • the 3D visualizations (316) are generated page by page, starting from a first page of the one or more pages in a chronological order and the 3D visualization (316) of a subsequent page continues from a chronology of information from a previous page. So, when the user (322) turns the page of the document (302), the story or information flow continues from the previous page. This enables the user (322) to understand the document (302) even without knowing the language of the document (302) or even without having to read the document (302).
  • the user (322) may also access the 3D visualizations (316) of the documents previously generated and stored in data repository without requiring the document (302) itself and without having to go through the steps of method (200) again.
  • the 3D visualizations (316) may also be shared with other HMDs.
  • Figures 3C-3H illustrate other exemplary embodiments and implementations of the present invention:
  • Figure 3C illustrates a scenario where the user (322) is assumed to be a screenplay writer, director or any other person associated with the movie business, reading the document (302), which is a script of a movie to be made. As shown in figure 3C, the user (322) is able to see the 3D visualization (316) of the pages of the script that he is reading. In the example shown, it is assumed that the user (322) is reading a page about the introduction of a character, wherein the character is mentioned to be a rich man having a big house, who lives a disciplined life and likes to work out.
  • the processor (1044) uses this information as well as other cues from the script to generate and display the 3D visualization (316) comprising a plurality of 3D models (314), such as a big house and a man, and the character is shown jogging around the house in the mixed reality space (312).
  • the user (322) finds that there should be a luxury car around the house to properly reflect the personality and lifestyle of the character. So, the user (322) may add a new 3D model of a luxury car from the data repository.
  • if the user (322) does not like the stored plurality of 3D models of the luxury car, he/she may simply draw a 2D sketch on a paper or a 3D sketch in the air using the handheld electronic controller (106) or hand gestures (bare hands) in the mixed reality space (312).
  • Figure 3D illustrates the user (322) making the car in the mixed reality space (312).
  • multiple tool palettes (344) are available for the user (322) to utilise.
  • These air sketches and models can easily be made by any common person, without requiring any expertise in sketching/drawing.
  • This 3D sketch may then be converted to a new 3D model (342) by following the steps 220-250.
  • the generated 3D model (342) may then be added to the previously generated 3D visualization (316).
  • the same has been illustrated in figure 3E.
  • if the user (322) finds the luxury car he/she needs for the 3D visualization (316) in some electronic file or a physical paper/book, the user (322) may easily use the handheld electronic controller (106) to mark a boundary around the luxury car (or any other object, for that matter) printed in the book.
  • the processor (1044) detects and identifies the marked area and the object enclosed therein, and creates a 3D model of the same.
  • the generated 3D model may then be added in the 3D visualization (316) (similar to copy-paste functionality).
  • the user (322) may easily select the desired object and edit the selected object in one or more modes available in the HMD (102).
  • the one or more modes are selected from, but not limited to, a sketching mode, painting mode, sculpting mode and modelling mode, and each mode comprises a customised tool palette (344) showing the different tools required for the respective mode.
  • the user (322) may use the handheld electronic controller (106) as a tool in the mixed reality space (312) for adding, removing, editing or modelling any 3D model or sketch in any of the above-mentioned modes.
  • the tool is selected from, but not limited to, selection tools, sketching tools such as a pencil, pen and paint brush, and hardware tools such as a hammer, chisel, screwdriver, plier, axe and wrench.
  • the handheld electronic controller (106) is configured to provide a dynamic haptic feedback and spatial sound effect according to a selected virtual tool, a pressure applied by the user (322) and a material being modelled, while interacting in the mixed reality space (312).
  • the user (322) may add the edited 3D object (342) to the 3D visualization (316) again.
  • the luxury car may be seen parked outside the house while the character jogs in the updated 3D visualization (318).
  • the present invention may be used to assist in writing a screenplay for a movie plot/story by providing 3D visualizations (316) of how the scenes can be set up or how the scenes would appear (if the screenplay is being read using the HMD (102)). Additionally, these 3D visualizations (316, 318) may be stored and shown to the actors, producers etc. during narration of the script for better understanding.
  • Figure 3F illustrates an exemplary embodiment in which it is assumed that the user (322) is reading a manual of a car engine, and the 3D visualization (316) comprising a plurality of 3D models (314) is generated and displayed in the mixed reality space (312).
  • the user (322) may interact with the plurality of 3D models (314) and observe their operations in the mixed reality space using the handheld electronic controller (106) or hand gestures.
  • the present invention may help common users as well as engineering students to have a better understanding of their vehicle or field of study.
  • Figure 3G illustrates a scenario where the user (322) may be a student studying workshop technology, reading a topic on making wooden T-joints in the document (i.e. an academic book).
  • the user (322) may not only see the 3D visualizations (316) in the mixed reality space (312) for better understanding, but also get a life-like hands-on experience of modelling the plurality of 3D objects (342).
  • the user (322) is illustrated to be using the “modelling mode” in the HMD (102) and using the handheld electronic controller (106) as a chisel.
  • the handheld electronic controller (106) is further configured to provide a dynamic haptic feedback and spatial sound effect according to a selected virtual tool, a pressure applied by the user (322) and a material being modelled, while interacting in the mixed reality space (312).
  • the user (322) would actually feel the vibration while chiselling the wood or the impact while using the handheld electronic controller (106) as the hammer.
  • the audio unit of the HMD (102) also provides the spatial sound effect of the hammer hitting the wood to provide a more realistic feel and an immersive experience.
  • the processor (1044) and the HMD (102) automatically detect the tool to be used by the user (322) based on the manner in which it is held.
  • the processor (1044) displays the tools required by the user (322) in the selected mode on a virtual belt worn by the user (322), just like the manner in which a professional works.
  • the user (322) may reach for the belt in the mixed reality space (312) to access the required tool.
  • FIG. 3H illustrates an example where the user (322) is assumed to be a child reading his/her sketching/colouring book.
  • the processor (1044) generates the 3D visualisation of pages of the book & also enables the user (322) to do air sketching and colouring of the plurality of 3D models (342) in the mixed reality space (312).
  • the 3D visualization (316) of an underwater sea is shown, and the user (322) is air sketching using the hand gestures (3222) in the 3D mixed reality space (312).
  • the 3D sketching with hand gestures (3222) and handheld electronic controller (106) may also be accompanied with voice inputs from the user (322).
  • the user (322) may describe what he/she wants to add to the 3D sketch and, using speech recognition and interpretation capabilities, the processor (1044) may add those details and make relevant suggestions for the user (322) to accept and add to the 3D sketch.
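  • An assumption-laden sketch of turning a transcribed voice request into a sketch-edit operation; a real system would combine speech recognition with natural-language understanding, and the keyword lists here are invented for the illustration:

```python
# Illustrative parsing of a transcribed voice request into an edit operation.
# The colour and body-part vocabularies are invented for the sketch.
COLOURS = {"red", "blue", "yellow", "green"}
PARTS = {"tail", "fin", "body", "head"}

def parse_voice_edit(transcript: str) -> dict | None:
    """Return a colour-edit operation if the request names a colour and a part."""
    words = {w.strip(".,!?'").lower() for w in transcript.split()}
    colour, part = words & COLOURS, words & PARTS
    if colour and part:
        return {"op": "colour", "part": part.pop(), "colour": colour.pop()}
    return None

# parse_voice_edit("Make the tail red") -> {'op': 'colour', 'part': 'tail', 'colour': 'red'}
```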
  • the user (322) may easily speak about adding particular colours to fish’s body parts while sketching the fish.
  • the 3D sketch may then be converted to a 3D model and accordingly to the 3D visualization (316).
  • the user (322) may easily access the one or more modes in the HMD (102) for sketching, sculpting, modelling etc. at all times without the requirement of a document (302).
  • the system comprises a control module (402) connected with the HMD (102) and an interface module (404).
  • the interface module (404) is configured to receive one or more images of one or more pages of a document (302) in a chronological order, one by one using an image acquisition device of the HMD (102).
  • control module (402) is configured to process the one or more images for text recognition and extracting information from the recognised text in a form of a plurality of excerpts; identify a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset; and convert the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models (314) using a second prestored dataset. Additionally, the control module (402) is further configured to process and merge the plurality of 3-Dimensional (3D) models to generate a 3D visualization (316) depicting the information on the one or more pages.
  • the interface module (404) is further configured to display the generated 3D visualization (316), having a predetermined duration, in the chronological order of the one or more pages, in a mixed reality space of the HMD (102).
  • the 3D visualizations (316) are generated page by page, starting from a first page of the one or more pages in a chronological order and the 3D visualization (316) of a subsequent page continues from a chronology of information from a previous page. This enables the user (322) to understand the document (302) even without knowing the language of the document (302).
  • the system (400) further comprises the handheld electronic controller (106) connected with the HMD (102) via the communication network (108).
  • the electronic handheld controller may also be connected via one or more ports of the HMD (102).
  • the handheld electronic controller (106) is configured to sketch and add a new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user (322) in the mixed reality space (312); and edit and model the previously generated plurality of 2D sketches and/or 3D sketches and/or the plurality of 3D models (314) in the mixed reality space (312).
  • the document (302) is selected from a group comprising, but not limited to, e-books, printed textbooks, handwritten texts, info-graphic books, user manuals, novels, e-files, manuscripts or inscriptions.
  • control module (402) is configured to extract information from the recognised text by extracting terms indicative of a corresponding plurality of objects and segmenting the recognised text into the plurality of excerpts based on the context in which the terms are being used.
  • the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
  • the plurality of objects include all the living and non-living objects selected from a group comprising, but not limited to, humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
  • the second prestored dataset includes a plurality of 3D models (314) corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
  • control module (402) is configured to process and merge the plurality of 3-Dimensional (3D) models by adding organic gestures and motion to the generated plurality of 3D models (314) using prestored gestures and motion profiles based on the generated plurality of 3D models (314) and the information extracted from the page.
  • the present invention offers a number of advantages. Firstly, the present invention provides a simple, cost-effective and easy-to-use solution for the problems of the prior art. Further, the present invention eliminates barriers such as language difference, cultural difference and even literacy from the transfer of knowledge and information. Even a person who does not know how to read and write can now understand written information in the form of 3D visualisations. Similarly, information in a foreign language can be understood in the same way. Furthermore, the present invention enables people of all age groups to express their own thoughts and ideas as well as understand others' thought processes using the 3D visualisations generated in the mixed reality space. People can now express their ideas using 3D sketching in the mixed reality space, which may be converted to 3D models and visualisations for effective transfer of information.
  • the wireless communication network used in the system can be a short-range and/or a long-range communication network, and a wired or wireless communication network.
  • the communication interface includes, but is not limited to, a serial communication interface, a parallel communication interface or a combination thereof.
  • module refers to logic embodied in hardware or firmware, or to a collection of software instructions, written in a programming language, such as, for example, Java, C, Python or assembly.
  • One or more software instructions in the modules may be embedded in firmware, such as an EPROM.
  • modules may comprise connected logic units, such as gates and flip-flops, and may comprise programmable units, such as programmable gate arrays or processors.
  • the modules described herein may be implemented as either software and/or hardware modules and may be stored in any type of computer- readable medium or other computer storage device.
  • any function or operation that has been described as being performed by a module could alternatively be performed by a different server, by the cloud computing platform, or a combination thereof.
  • the techniques of the present disclosure might be implemented using a variety of technologies.
  • the methods described herein may be implemented by a series of computer executable instructions residing on a suitable computer readable medium.
  • Suitable computer readable media may include volatile (e.g. RAM) and/or non-volatile (e.g. ROM, disk) memory, carrier waves and transmission media.
  • Exemplary carrier waves may take the form of electrical, electromagnetic or optical signals conveying digital data streams along a local network or a publicly accessible network such as the Internet.

Abstract

A method (200) for generating a 3D visualization (316) using information from a document (302) in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD (102)) is provided. The method (200) comprises receiving one or more images of one or more pages of a document (302) using an image acquisition device of the HMD (102); processing the one or more images for text recognition and extracting information from the recognised text in a form of a plurality of excerpts; identifying a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts; converting the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models (314); processing and merging the plurality of 3D models (314) to generate a 3D visualization (316) and displaying the generated 3D visualization (316) in a mixed reality space (312) of the HMD (102).

Description

A SYSTEM AND A METHOD FOR GENERATING A 3D VISUALIZATION USING MIXED REALITY-BASED HMD

FIELD OF THE INVENTION
[0001] Embodiments of the present invention generally relate to an artificial intelligence powered mixed reality system and more particularly to a computer system, method and system for generating a 3D visualization from information such as text, 2D sketches, 3D sketches and voice input, using a Mixed Reality (MR) based Head Mounted Device (HMD).
BACKGROUND OF THE INVENTION
[0002] Illustrations and visualisations have always been considered as one of the most effective form of communication. Illustrations and visualisations are able to convey the most complex information using minimum space and time. The expression and the interpretation of ideas are most conveniently done using visualisation. For example: Children learning about water cycle easily understand the concept when it is explained the help of a diagram, rather than all the text information about processes mentioned in the books. Additionally, it is a proven fact that information studied with the help of illustrations/graphics easily holds reader’s attention and is better understood and remembered. Similarly, most of the times it is easier to visualize ideas and difficult to write them down in words. Humans all over the world have been using such methods to convey their ideas but only a selected section of individuals such as painters, sketch artists, animation professionals etc are capable of presenting such informative visualisations.
[0003] All other people, especially children who have a lot of ideas are unable to express their ideas in an effective manner because not everybody is good at illustrations. Children find it most difficult to express their thoughts and/or understand new things they come across. Furthermore, most of the information transfer such as education (academics), daily news, operation manual of things etc. takes place via written text form (via books, manuals etc.). As previously mentioned, written information is not effective in holding a reader’s attention and is therefore inefficient when it comes to conveying the subject matter of information. This is the why today’s generation prefer studying via YouTube videos instead of text books. Additionally, if the text information is written in as per a particular culture or language but the reader belongs to a different culture, language or is unable to read/write, then for him/her written information is rendered completely useless.
[0004] In this search for an intuitive tool for expressing our thoughts to the world, science and technology have taken a leap from alphanumeric keyboards and touchscreens to hand gestures and other interfaces. But even these tools require a certain level of expertise for expressing thoughts, which humans are assumed to acquire with experience in handling such tools. Existing methods lack a system/method which people of all age groups (even a toddler), cultures, languages and literacy levels can use to express their thoughts/ideas as well as to easily understand given information. Existing solutions fail to leverage vision based artificial intelligence to empower a human being of any age/experience with the ability to share his thoughts just by moving hands, fingers, an external controller etc., or to understand written information provided in text books, e-books etc. via visual and interactive experiences. Such solutions fail to empower humans with a tool that can seamlessly let them convey the ideas in their brain as well as understand available information, and eliminate the loss of information while conveying/receiving thoughts from one person to another.
[0005] Therefore, there is a need in the art for a computer system, method and system for generating a 3D visualization from information such as text, 2-D sketches, 3D sketches and voice input, using a Mixed Reality (MR) based Head Mounted Device (HMD).
OBJECT OF THE INVENTION
[0006] An object of the present invention is to provide a computer system, a method and a system for generating a 3D visualization from information such as text, 2D sketches, 3D sketches and voice input, using a Mixed Reality (MR) based Head Mounted Device (HMD).

[0007] Another object of the present invention is to convert written information and 2D sketches into 3D models and visualisations.
[0008] Yet another object of the present invention is to provide a handheld electronic controller to create and interact with the generated 3D sketches, 3D models and 3D visualisations in the mixed reality space.
[0009] Yet another object of the present invention is to enable users to draw 3D sketches in the mixed reality space using hand gestures or the electronic controller and then convert the drawn sketches to 3D models and visualisations.
[0010] Yet another object of the present invention is to enable users to add, remove and design the 3D models using the electronic controller as any design tool such as a paint brush, hammer, chisel or axe, while providing spatial sound effects and dynamic haptic feedback depending upon the tool selected and the virtual material being worked upon.
SUMMARY OF THE INVENTION
[0011] According to a first aspect of the invention, there is provided a method for generating a 3D visualization using information from a document in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD). The method comprises the steps of receiving one or more images of one or more pages of a document in a chronological order, one by one using an image acquisition device of the HMD; processing the one or more images for text recognition and extracting information from the recognised text in a form of a plurality of excerpts; identifying a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset; converting the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models using a second prestored dataset; processing and merging the plurality of 3-Dimensional (3D) models to generate a 3D visualisation/animation depicting the information on the one or more pages; and displaying the generated 3D visualisation having a predetermined duration in the chronological order of the one or more pages, in a mixed reality space of the HMD, thereby enabling a user to understand the document even without knowing the language of the document. Further, the 3D visualisations are generated page by page, starting from a first page of the one or more pages in a chronological order, and the 3D visualisation of a subsequent page continues from a chronology of information from a previous page.
[0012] In accordance with an embodiment of the present invention, the method further comprises sketching & adding new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user in the mixed reality space, using a handheld electronic controller connected with HMD; and editing & modelling with the previously generated plurality of 2D sketches and/or 3D sketches and/or the plurality of 3D models in the mixed reality space, using the handheld electronic controller.
[0013] In accordance with an embodiment of the present invention, the document is selected from a group comprising e-books, printed textbooks, handwritten texts, info-graphic books, user manuals, novels, e-files, manuscripts or inscriptions.
[0014] In accordance with an embodiment of the present invention, extracting information from the recognised text includes extracting terms indicative of a corresponding plurality of objects and segmenting the recognised text into the plurality of excerpts based on the context in which the terms are being used.
[0015] In accordance with an embodiment of the present invention, the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
[0016] In accordance with an embodiment of the present invention, the plurality of objects include all the living and non-living objects selected from a group comprising humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
[0017] In accordance with an embodiment of the present invention, the second prestored dataset includes a plurality of 3D models corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
[0018] In accordance with an embodiment of the present invention, processing and merging the plurality of 3-Dimensional (3D) models to generate a 3D visualisation comprises a step of adding organic gestures and motion to the generated plurality of 3D models using prestored gestures and motion profiles based on the generated plurality of 3D models and the information extracted from the page.
[0019] According to a second aspect of the invention, there is provided a computer system for generating a 3D visualization using information from a document in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD). The computer system comprises a memory unit configured to store machine-readable instructions; and a processor operably connected with the memory unit. The processor obtains the machine-readable instructions from the memory unit, and is configured by the machine-readable instructions to receive one or more images of one or more pages of a document in a chronological order, one by one using an image acquisition device of the HMD; process the one or more images for text recognition and extracting information from the recognised text in a form of a plurality of excerpts; identify a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset; convert the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models using a second prestored dataset; process and merge the plurality of 3-Dimensional (3D) models to generate a 3D visualisation depicting the information on the one or more pages; and display the generated 3D visualisation having a predetermined duration in the chronological order of the one or more pages, in a mixed reality space of the HMD, thereby enabling a user to understand the document even without knowing the language of the document. Further, the 3D visualisations are generated page by page, starting from a first page of the one or more pages in a chronological order, and the 3D visualisation of a subsequent page continues from a chronology of information from a previous page.
[0020] In accordance with an embodiment of the present invention, the system further comprises a handheld electronic controller connected with the HMD. The handheld electronic controller is configured to sketch & add new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user, in the mixed reality space; and edit & model with the previously generated plurality of 2D sketches and/or 3D sketches and/or the plurality of 3D models in the mixed reality space.
[0021] In accordance with an embodiment of the present invention, the document is selected from a group comprising e-books, printed textbooks, handwritten texts, info-graphic books, user manuals, novels, e-files, manuscripts or inscriptions.
[0022] In accordance with an embodiment of the present invention, the processor is configured to extract information from the recognised text by extracting terms indicative of a corresponding plurality of objects and segmenting the recognised text into the plurality of excerpts based on the context in which the terms are being used.
[0023] In accordance with an embodiment of the present invention, the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
[0024] In accordance with an embodiment of the present invention, the plurality of objects include all the living and non-living objects selected from a group comprising humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
[0025] In accordance with an embodiment of the present invention, the second prestored dataset includes a plurality of 3D models corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
[0026] In accordance with an embodiment of the present invention, the processor is configured to process and merge the plurality of 3-Dimensional (3D) models to form the 3D visualisation/animation by adding organic gestures and motion to the generated plurality of 3D models using prestored gestures and motion profiles based on the generated plurality of 3D models and the information extracted from the page.
[0027] According to a third aspect of the invention, there is provided a system for generating a 3D visualization using information from a document in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD). The system comprises a control module connected with the HMD, and an interface module. The interface module is configured to receive one or more images of one or more pages of a document in a chronological order, one by one using an image acquisition device of the HMD. Further, the control module is configured to process the one or more images for text recognition and extracting information from the recognised text in a form of a plurality of excerpts; identify a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset; and convert the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models using a second prestored dataset. Additionally, the control module is configured to process and merge the plurality of 3-Dimensional (3D) models to generate a 3D visualisation depicting the information on the one or more pages. Furthermore, the interface module is further configured to display the generated 3D visualisation having a predetermined duration in the chronological order of the one or more pages, in a mixed reality space of the HMD. Moreover, the 3D visualisations are generated page by page, starting from a first page of the one or more pages in a chronological order, and the 3D visualisation of a subsequent page continues from a chronology of information from a previous page. This enables the user to understand the document even without knowing the language of the document.
[0028] In accordance with an embodiment of the present invention, the system further comprises the handheld electronic controller connected with the HMD. The handheld electronic controller is configured to sketch & add new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user in the mixed reality space; and edit & model with the previously generated plurality of 2D sketches and/or 3D sketches and/or the plurality of 3D models in the mixed reality space.
[0029] In accordance with an embodiment of the present invention, the document is selected from a group comprising, but not limited to, e-books, printed textbooks, hand written texts, info-graphic books, user manuals, novels, e-files, manuscripts or inscriptions.
[0030] In accordance with an embodiment of the present invention, the control module is configured to extract information from the recognised text by extracting terms indicative of a corresponding plurality of objects and segmenting the recognised text into the plurality of excerpts based on the context in which the terms are being used.
[0031] In accordance with an embodiment of the present invention, the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
[0032] In accordance with an embodiment of the present invention, the plurality of objects include all the living and non-living objects selected from a group comprising, but not limited to, humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
[0033] In accordance with an embodiment of the present invention, the second prestored dataset includes a plurality of 3D models corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
[0034] In accordance with an embodiment of the present invention, the control module is configured to process and merge the plurality of 3-Dimensional (3D) models by adding organic gestures and motion to the generated plurality of 3D models using prestored gestures and motion profiles based on the generated plurality of 3D models and the information extracted from the page.
[0035] According to a fourth aspect of the invention, a handheld electronic controller connected with a Mixed Reality (MR) based Head Mounted Device (HMD) is provided. The handheld electronic controller comprises at least a magnetic coil, a tactile sensor, a haptic feedback device, a biometric sensor, a microprocessor and a power source. Further, the handheld electronic controller is configured to sketch & add new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user in the mixed reality space; and edit & model with the previously generated plurality of 2D sketches and/or the 3D sketches and/or the plurality of 3D models in the mixed reality space.
[0036] In accordance with an embodiment of the present invention, the handheld electronic controller is configured to be used as a tool in one or more modes based on manual selection and/or a manner of holding the handheld electronic controller. Further, the tool is selected from selection tools, sketching tools such as pencil, pen and paint brush, and hardware tools such as hammer, chisel, screwdriver, plier, axe and wrench. Also, the handheld electronic controller is further configured to provide a dynamic haptic feedback using the haptic feedback device and a spatial sound effect according to a selected virtual tool, a pressure applied by the user and a material being modelled, while interacting in the mixed reality space. In addition, the tactile sensor senses the force exerted by the user to simulate click and drag actions. Moreover, the one or more modes are selected from sketching mode, painting mode, sculpting mode and modelling mode, each mode comprising a customised tool palette showing the different tools required for the respective mode. Furthermore, the biometric sensor is configured to identify and save profiles for multiple users so as to change the one or more modes automatically as preferred by the user.
[0037] In accordance with an embodiment of the present invention, sketching & adding new plurality of 2D sketches and/or 3D sketches further includes copying 2D or 3D images from visible real-world sources such as a physical magazine or an electronic file by creating a boundary to enclose a required 2D or 3D image using the electronic controller; and pasting the selected 2D or 3D image in the mixed reality space of the HMD and converting the 2D or 3D image into a 3D model.
BRIEF DESCRIPTION OF THE DRAWINGS
[0038] So that the manner in which the above recited features of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to embodiments, some of which are illustrated in the appended drawings. It is to be noted, however, that the appended drawings illustrate only typical embodiments of this invention and are therefore not to be considered limiting of its scope, for the invention may admit to other equally effective embodiments.
[0039] These and other features, benefits and advantages of the present invention will become apparent by reference to the following text and figures, with like reference numbers referring to like structures across the views, wherein:

[0040] Fig. 1 illustrates an exemplary environment of computing devices to which the various embodiments described herein may be implemented, in accordance with an embodiment of the present invention;
[0041] Fig. 2 illustrates a method for generating a 3D visualization using information from a document in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD), in accordance with an embodiment of the present invention;
[0042] Fig. 3A-3B illustrates an information flow diagram of an implementation of the computer system of Fig. 1 and the method of Fig. 2, in accordance with an embodiment of the present invention;
[0043] Fig. 3C-3H illustrates exemplary implementations of the computer system of Fig. 1 and the method of Fig. 2, in accordance with other embodiments of the present invention; and
[0044] Fig. 4 illustrates a system for generating a 3D visualization using information from a document in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD), in accordance with an embodiment of the present invention.
DETAILED DESCRIPTION OF DRAWINGS
[0045] While the present invention is described herein by way of example using embodiments and illustrative drawings, those skilled in the art will recognize that the invention is not limited to the embodiments or drawings described, and the drawings are not intended to represent the scale of the various components. Further, some components that may form a part of the invention may not be illustrated in certain figures, for ease of illustration, and such omissions do not limit the embodiments outlined in any way. It should be understood that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the invention is to cover all modifications, equivalents, and alternatives falling within the scope of the present invention as defined by the appended claims. As used throughout this description, the word "may" is used in a permissive sense (i.e. meaning having the potential to), rather than the mandatory sense (i.e. meaning must). Further, the words "a" or "an" mean "at least one" and the word "plurality" means "one or more" unless otherwise mentioned. Furthermore, the terminology and phraseology used herein is solely used for descriptive purposes and should not be construed as limiting in scope. Language such as "including," "comprising," "having," "containing," or "involving," and variations thereof, is intended to be broad and encompass the subject matter listed thereafter, equivalents, and additional subject matter not recited, and is not intended to exclude other additives, components, integers or steps. Likewise, the term "comprising" is considered synonymous with the terms "including" or "containing" for applicable legal purposes. Any discussion of documents, acts, materials, devices, articles and the like is included in the specification solely for the purpose of providing a context for the present invention. It is not suggested or represented that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention.
[0046] In this disclosure, whenever a composition or an element or a group of elements is preceded with the transitional phrase "comprising", it is understood that we also contemplate the same composition, element or group of elements with the transitional phrases "consisting of", "consisting", "selected from the group consisting of", "including", or "is" preceding the recitation of the composition, element or group of elements, and vice versa.
[0047] The present invention is described hereinafter by various embodiments with reference to the accompanying drawings, wherein reference numerals used in the accompanying drawings correspond to the like elements throughout the description. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, the embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art. In the following detailed description, numeric values and ranges are provided for various aspects of the implementations described. These values and ranges are to be treated as examples only and are not intended to limit the scope of the claims. In addition, a number of materials are identified as suitable for various facets of the implementations. These materials are to be treated as exemplary and are not intended to limit the scope of the invention.

[0048] Figure 1 illustrates an exemplary environment of computing devices to which the various embodiments described herein may be implemented, in accordance with an embodiment of the present invention.
[0049] As shown in figure 1, the environment (100) comprises a Mixed Reality (MR) based Head Mounted Device (HMD) (102). The HMD (102) may be envisaged to include capabilities of generating an augmented reality (AR) environment, a mixed reality (MR) environment and an extended reality (XR) environment in a single device that lets a user interact with digital content within the environment generated in the HMD (102). The HMD (102) is envisaged to be worn by a user and therefore may be provided with, but not limited to, one or more bands, straps and locks for mounting on the head; or may even be provided as smart glasses (with temples) to be worn just like spectacles. The HMD (102) is envisaged to include components such as, but not limited to, an optical unit having one or more lenses, one or more reflective mirrors & a display unit; a sensing unit having one or more sensors & an image acquisition device; an audio unit comprising one or more speakers and one or more microphones; a user interface; a wireless communication module; and one or more ports. In accordance with an embodiment of the present invention, the optical unit is envisaged to provide a high resolution of 2K per eye and a wider field of view. The display unit comprises a Liquid Crystal on Silicon (LCoS) display and a visor.
[0050] In accordance with an embodiment of the present invention, the one or more sensors may be selected from, but not limited to, an RGB sensor, a depth sensor, an eye tracking sensor, an ambient light sensor, an accelerometer, a gyroscope, an Inertial Measurement Unit (IMU) sensor (a combination of an accelerometer, a gyroscope and a magnetometer) and an EM emitter sensor. The EM emitter sensor can be placed on top of the HMD or separately, like a centralized pose tracking hub which can track any EM receiver. Herein, the pose estimation or 6DOF tracking of the HMD is done using visual and inertial information from the RGB sensor and the IMU sensor. Further, the eye tracking sensors may include one or more ultra-compact sensor cubes configured for focal adjustment of the optics, Point of View (POV) rendering with computational optimisation and Point of Interest (POI) information capture by tracking the retina of the eye.
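By way of a non-limiting illustration only, and not forming part of the claimed subject matter, the following Python sketch shows the inertial half of such a visual-inertial pipeline: a standard complementary filter that fuses gyroscope and accelerometer samples into a drift-corrected tilt estimate. The function name, axis convention and blending constant are illustrative assumptions, not a disclosure of the actual tracking algorithm.

```python
import math

def complementary_filter(pitch_deg, gyro_rate_dps, accel_xyz, dt, alpha=0.98):
    """Fuse one gyroscope and one accelerometer sample into a pitch estimate.

    pitch_deg     : previous pitch estimate (degrees)
    gyro_rate_dps : angular rate about the pitch axis (degrees/second)
    accel_xyz     : (ax, ay, az) accelerometer sample, in units of g
    dt            : sample period in seconds
    alpha         : weight given to the smooth but drifting gyro path
    """
    ax, ay, az = accel_xyz
    # Absolute but noisy pitch, taken from the direction of gravity.
    accel_pitch = math.degrees(math.atan2(-ax, math.hypot(ay, az)))
    # Smooth but drifting pitch, obtained by integrating the gyro rate.
    gyro_pitch = pitch_deg + gyro_rate_dps * dt
    # Blend: the gyro dominates short-term, the accelerometer corrects drift.
    return alpha * gyro_pitch + (1.0 - alpha) * accel_pitch
```

A full 6DOF tracker would additionally fuse visual features from the RGB sensor; the blending idea, however, remains the same.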
[0051] Further, the image acquisition device is selected from one or more of, but not limited to, omnidirectional cameras, wide angle stereo vision camera, RGB-D camera, digital cameras, thermal cameras, Infrared cameras and night vision cameras.
[0052] In accordance with an embodiment of the present invention, the one or more microphones in the audio unit are configured to capture binaural audio along the motion of the user and 3D stereo sound with acoustic source localization with the help of the IMU. The audio unit may also implement background noise cancellation techniques to further enhance the experience. Furthermore, the one or more speakers may have an audio projection mechanism that projects sound directly to the concha of an ear of the user, reaching the ear canal after multiple reflections.
[0053] In accordance with an embodiment of the present invention, the one or more ports are configured to enable wired connection between one or more external devices and the HMD (102). The one or more ports may be, but not limited to, micro-USB ports, USB Type-C ports and HDMI ports. Further, the wireless communication module is configured to establish a wireless communication network (108) to enable wireless communication between the one or more external devices and the HMD (102). In that sense, the communication module may include one or more of, but not limited to, a WiFi module, a Bluetooth module, an NFC module or a GSM/GPRS module. Therefore, the wireless communication network (108) may be, but not limited to, a wireless intranet network, Bluetooth, NFC, WiFi internet or a GSM/GPRS based 4G LTE or 5G communication network.
[0054] Further, the environment (100) comprises the computer system (104) connected with the HMD (102). In accordance with an embodiment of the present invention, the computer system (104) may be encased inside the HMD (102) itself. The computer system (104) comprises a memory unit (1042) configured to store machine-readable instructions. The machine-readable instructions may be loaded into the memory unit (1042) from a non-transitory machine-readable medium, such as, but not limited to, CD-ROMs, DVD-ROMs and Flash Drives. Alternately, the machine-readable instructions may be loaded in the form of a computer software program into the memory unit (1042). The memory unit (1042) in that manner may be selected from a group comprising EPROM, EEPROM and Flash memory. Further, the computer system (104) includes a processor (1044) operably connected with the memory unit (1042). In various embodiments, the processor (1044) may be a multipurpose, clock driven, register based, digital integrated circuit that accepts binary data as input, processes it according to instructions stored in its memory and provides results as output. In one embodiment, the processor (1044) may be, but not limited to, a microprocessor. The microprocessor may contain both combinational logic and sequential digital logic.
[0055] The computer system (104) may further implement artificial intelligence and machine learning based technologies for, but not limited to, data analysis, collating data and presentation of data in real-time.
[0056] In accordance with an embodiment of the present invention, a data repository (not shown) is also connected with the computer system (104). The data repository may be, but not limited to, a local or a cloud-based storage, configured to store a first prestored dataset, a second prestored dataset and gestures & motion profiles. Herein, the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects. The plurality of objects include all the living and non-living objects selected from a group comprising humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment. Further, the second prestored dataset includes a plurality of 3D models corresponding to a respective plurality of 2D sketches, the extracted information and the 3D sketches of the above-mentioned plurality of objects.
[0057] Additionally, the gestures & motion profiles include various common gestures and motions of the above-mentioned plurality of objects. For example: it may include gestures and motions of a person (both male/female) performing day-to-day activities like walking, running, reading, eating etc., complex activities like dancing and combat animations, river flowing, movements of animals, working of vehicles, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment etc. In one embodiment, the data repository also stores spatial sounds associated with the plurality of objects. For example: sounds of various tools such as hammer blows, pencil strokes, flowing water bodies, sounds of vehicles, sounds of animals etc. may be prestored in the data repository, which may be provided to the computer system (104) when queried with appropriate protocols.
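As a non-limiting sketch of how such a repository might be queried when the computer system (104) requests a profile "with appropriate protocols", a keyed lookup can return the motion profile and spatial sound for a given object and activity. The table contents, file paths and function name below are illustrative assumptions only:

```python
# Hypothetical repository entries: (object, activity) -> prestored motion profile.
MOTION_PROFILES = {
    ("man", "fishing"): "profiles/man_fishing.anim",
    ("man", "walking"): "profiles/man_walking.anim",
    ("river", "flowing"): "profiles/river_flow.anim",
}

# Hypothetical spatial sounds associated with objects.
SPATIAL_SOUNDS = {
    "hammer": "sounds/hammer_blow.wav",
    "river": "sounds/water_flow.wav",
}

def query_repository(obj, activity):
    """Return the motion profile and ambient sound for an object/activity pair,
    falling back to an idle profile when no specific entry exists."""
    profile = MOTION_PROFILES.get((obj, activity), "profiles/idle.anim")
    sound = SPATIAL_SOUNDS.get(obj)  # may be None: not every object emits sound
    return profile, sound

# e.g. query_repository("man", "fishing")
#   -> ("profiles/man_fishing.anim", None)
```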
[0058] In one embodiment, the HMD (102) is envisaged to include a battery unit. The battery unit is configured to power the components of the HMD (102) and the computer system (104). In that sense, the battery unit may be, but not limited to, a Li-ion detachable and rechargeable battery. The battery may be detached, recharged and then re-attached.
[0059] In accordance with an embodiment of the present invention, the environment further comprises a handheld electronic controller (106) connected with the HMD (102) and the computer system (104) via the communication network (108). In one embodiment, the handheld electronic controller may also be connected via the one or more ports of the HMD (102). The handheld electronic controller (106) comprises a magnetic coil, a microprocessor, a tactile sensor, a haptic feedback mechanism, a biometric sensor and a power source. The magnetic coil acts as a receiver for the magnetic field emitted by the electromagnetic emitter sensors of the HMD (102). The microprocessor may be a multipurpose, clock driven, register based, digital integrated circuit that accepts binary data as input, processes it according to instructions stored in its memory and provides results as output. The microprocessor is used for receiving and processing the magnetic field, thereby enabling movement tracking of the handheld electronic controller (106). However, it will be appreciated by a skilled addressee that, other than electromagnetic tracking, techniques such as optical/visual tracking and visual-inertial tracking may also be used without departing from the scope of the present invention. Further, the power source may be, but not limited to, a Li-ion detachable and rechargeable battery configured to power the magnetic coil and the microprocessor.
[0060] In accordance with an embodiment of the present invention, the handheld electronic controller (106) may be enclosed in a housing made of, but not limited to, plastic or metal and shaped as, but not limited to, a gamepad, joystick, wand or any customised shape. Preferably, the handheld electronic controller (106) has a custom shape having a thicker upper portion with one or more pointed ends and a thinner lower portion that is to be held by the user. The custom shape is such that it may be held like multiple tools, such as a pen/pencil or paint brush, or like hardware tools such as a hammer, chisel or axe. In some embodiments, the controller has an organic and ergonomic design that resembles the stone tools of early humans, which were used like a Swiss knife for multiple tasks. In another embodiment, the microprocessor identifies the tool which the user wants to use the handheld electronic controller (106) as, based on the way it is held by the user.
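The specification does not disclose a particular grip-recognition algorithm. Assuming, purely for illustration, a tactile sensor that reports normalised pressure over a few zones of the grip, one minimal approach is a nearest-signature classifier; all signatures and names below are hypothetical:

```python
# Hypothetical grip signatures: mean pressure on four tactile zones per tool grip.
GRIP_SIGNATURES = {
    "pencil": [0.9, 0.8, 0.1, 0.0],   # pinched near the pointed tip
    "hammer": [0.2, 0.6, 0.9, 0.9],   # fist wrapped around the lower shaft
    "chisel": [0.7, 0.7, 0.5, 0.1],
}

def identify_tool(tactile_zones):
    """Classify the intended tool from tactile readings (one value per zone,
    0..1) by nearest squared Euclidean distance to a stored grip signature."""
    def dist(signature):
        return sum((a - b) ** 2 for a, b in zip(tactile_zones, signature))
    return min(GRIP_SIGNATURES, key=lambda tool: dist(GRIP_SIGNATURES[tool]))

# e.g. identify_tool([0.85, 0.75, 0.15, 0.05]) -> "pencil"
```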
[0061] The biometric sensor enables the electronic controller to identify and save profiles for different users so that it can change the mode automatically as preferred by the user. Further, the tactile sensor senses the force exerted by the user to simulate click and drag actions. Additionally, the haptic feedback mechanism may comprise one or more motors having unbalanced weights to create vibrations; or one or more sliding plates in the surface of the thinner lower portion to be held, which mimic the shear and friction forces one would feel when interacting with real objects. The haptic feedback mechanism provides the feel of using a tool based on the tool being used and the material being worked upon, thereby giving the sensation of friction and material-tool interaction. Additionally, spatial sound effects of the tool being used are also provided via the audio unit of the HMD (102). For example, a user would actually feel the impact on the hand and a banging sound in the ears if he/she is using the handheld electronic controller (106) as a hammer to shape a metallic object in the mixed reality space.
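By way of a non-limiting illustration of how the haptic feedback could scale with the selected tool, the applied pressure and the virtual material, the sketch below computes a pulse amplitude and frequency; the lookup tables and constants are assumptions made solely for this example:

```python
# Hypothetical lookup tables: how "hard" each virtual material feels, and the
# base vibration each tool produces on impact.
MATERIAL_HARDNESS = {"wood": 0.5, "metal": 0.9, "clay": 0.2}
TOOL_BASE_AMPLITUDE = {"hammer": 1.0, "chisel": 0.6, "paint_brush": 0.05}

def haptic_pulse(tool, material, pressure):
    """Return (amplitude, frequency_hz) for one haptic pulse.

    pressure is the normalised force (0..1) read from the tactile sensor;
    harder materials and heavier tools yield stronger, sharper vibration.
    """
    amplitude = TOOL_BASE_AMPLITUDE[tool] * MATERIAL_HARDNESS[material] * pressure
    frequency = 40.0 + 200.0 * MATERIAL_HARDNESS[material]  # stiffer -> higher pitch
    return min(amplitude, 1.0), frequency

# e.g. haptic_pulse("hammer", "metal", 0.8) -> (0.72, 220.0)
```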
[0062] In addition, the handheld electronic controller (106) may also comprise an interface. The interface may include one or more of, but not limited to, physical buttons, a trackpad, a joystick and touch-based buttons on the surface of the housing to enable a user to interact with objects in the mixed reality space. In one embodiment, there is no interface provided on the housing, and the user may simply touch the plurality of 3D models and virtual palettes displayed in the mixed reality space using the handheld electronic controller (106).

[0063] Figure 2 illustrates a method (200) for generating a 3D visualization using information from a document in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD (102)), in accordance with an embodiment of the present invention. Herein, the 3D visualisation may be understood as an animation of virtual 3D objects moving in the mixed reality space. Additionally, the document is selected from a group comprising, but not limited to, e-books, printed textbooks, hand written texts, info-graphic books, user manuals, novels, e-files, manuscripts and inscriptions. Therefore, the term "document" is envisaged to not only cover its literal meaning but is meant to cover books/magazines/manuals as well as any written, drawn, presented, or memorialized representation of thought.
[0064] The method (200) starts at step 210 by receiving one or more images of one or more pages of a document in a chronological order, one by one using the image acquisition device of the HMD (102). The image acquisition device can be understood as the various types of cameras provided on the HMD (102), as previously mentioned in the specification. The one or more images may be captured by the image acquisition device and then received by the processor (1044). The same has been illustrated in figure 3A. In the exemplary implementation shown in figure 3A, the document (302) is envisaged to be a story book that has two pages (3022) open at the same time. So, the processor (1044) receives two images of the pages (3022) in the chronological order.
[0065] Then at step 220, the one or more images are processed for text recognition and information is extracted from the recognised text in the form of a plurality of excerpts. Various techniques may be used for text recognition and information extraction, such as, but not limited to, Optical Character Recognition (OCR), polygon-based character recognition and any Artificial Intelligence (AI) based techniques. It will be appreciated by a skilled addressee that information presented in any or multiple languages unknown to the user (322) is also recognised and extracted by the processor (1044). The information extracted may be divided in the form of a plurality of excerpts and stored as a machine-interpretable array. An "excerpt" may be understood as one paragraph or a group of multiple paragraphs. Continuing from the example shown in figure 3A, it is assumed that the story book tells the story of a fisherman. So, the information extracted from the first two pages (3022) is divided into multiple excerpts; for example, the first excerpt talks about his personal life, appearance & residence. The next excerpts talk about one of his fishing trips. Similarly, further excerpts may talk about how he does his business or any other information. So, all this extracted information is stored in one continuous machine-interpretable array so that the story moves in continuation and does not restart at every page or paragraph.
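As a minimal illustrative sketch of this step, assuming the open-source pytesseract OCR wrapper as the recogniser (any comparable OCR or AI based technique would serve equally well), page images can be recognised in chronological order and segmented into a single continuous array of paragraph-level excerpts:

```python
from PIL import Image
import pytesseract  # assumed OCR backend; any text recogniser would work

def extract_excerpts(page_image_paths):
    """OCR each page image in chronological order and return one continuous
    list of paragraph-level excerpts, so the story never restarts at a page
    boundary."""
    excerpts = []
    for path in page_image_paths:
        text = pytesseract.image_to_string(Image.open(path))
        # Blank lines are treated as paragraph breaks.
        for paragraph in text.split("\n\n"):
            paragraph = " ".join(paragraph.split())  # collapse wrapped lines
            if paragraph:
                excerpts.append(paragraph)
    return excerpts
```

A production pipeline would additionally group related paragraphs into a single excerpt based on context, as described above; the continuous-array structure is the point of the sketch.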
[0066] After that, at step 230, the processor (1044) identifies a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset. It is a common practice around the world that most books and informative publications use 2D images/sketches along with the text to convey information. The processor (1044) herein identifies the 2D sketches and also the excerpt with which each is associated, based on the extracted information. The 2D sketches may be identified from the text by comparing the captured one or more images with the first prestored dataset. The first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects. The plurality of objects include all the living and non-living objects selected from a group comprising humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
[0067] Continuing from the previous example, it can be seen that the story book contains a sketch of a man fishing on the first page. So, the processor (1044) compares the captured 2D image with the plurality of objects to identify the objects in the image/sketch as a man, river, water body, net, fishing gear etc. Additionally, the processor (1044) recognises that this image is associated with the second excerpt, which talked about the fishing trip of the fisherman. In this manner, the plurality of 2D sketches are successfully linked with the text information. In one embodiment, the plurality of 2D sketches may be generated by the processor (1044) by taking cues (such as fishing trip, man, river, water body, net, fishing gear etc.) from the text information only and comparing them with the plurality of objects in the first prestored dataset. Accordingly, when the objects are identified, the corresponding 2D sketch is generated. In that case, techniques such as visual question answering may be used by the processor (1044).
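A minimal sketch of this linking step, assuming the detected object labels are plain words, is to score each excerpt by how many of the sketch's objects it mentions and associate the sketch with the best-scoring excerpt. The function and variable names below are illustrative only:

```python
def link_sketch_to_excerpt(detected_objects, excerpts):
    """Associate a 2D sketch with the excerpt that mentions the most of the
    objects detected in it (e.g. {"man", "river", "net"} matches the
    fishing-trip excerpt). Returns the index of the best-matching excerpt."""
    def overlap(excerpt):
        words = set(excerpt.lower().split())
        return sum(1 for obj in detected_objects if obj in words)
    return max(range(len(excerpts)), key=lambda i: overlap(excerpts[i]))

# e.g. link_sketch_to_excerpt({"man", "river", "net"},
#                             ["He lived alone in a hut.",
#                              "The man cast his net into the river."]) -> 1
```

In practice the matching would be fuzzier (synonyms, morphology, context), but the excerpt-scoring structure stays the same.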
[0068] Onwards, at step 240, the processor (1044) converts the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models (314) using a second prestored dataset. The second prestored dataset includes a plurality of 3D models (314) corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches. Herein, the plurality of 2D sketches that are identified or generated in the previous step are compared with the prestored 3D models of the corresponding 2D sketches, and accordingly the corresponding plurality of 3D models (314) are obtained. The same has been illustrated in figure 3B. In an embodiment where there are no 2D sketches and only text information, the plurality of 3D models (314) may be generated by the processor (1044) by taking cues (such as fishing trip, man, river, water body, net, fishing gear etc.) from the text information only and comparing them with the 3D models corresponding to the plurality of objects in the second prestored dataset. Accordingly, when the plurality of objects are identified, the corresponding plurality of 3D models (314) is generated. In that case, techniques such as visual question answering may be used by the processor (1044).
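As a non-limiting illustration, the second prestored dataset can be modelled as a lookup from object labels to 3D assets, with a text-cue fallback for pages that carry no 2D sketch. The asset paths and labels below are hypothetical:

```python
# Hypothetical second prestored dataset: object label -> 3D asset path.
MODEL_DATASET = {
    "man": "models/man.glb",
    "river": "models/river.glb",
    "net": "models/fishing_net.glb",
    "boat": "models/boat.glb",
}

def models_for_excerpt(sketch_objects, excerpt):
    """Resolve 3D models for an excerpt: prefer the objects identified in its
    2D sketch; when there is no sketch, fall back to cues found in the text."""
    labels = set(sketch_objects)
    if not labels:  # text-only page: take cues from the excerpt itself
        labels = {w for w in excerpt.lower().split() if w in MODEL_DATASET}
    return [MODEL_DATASET[label] for label in sorted(labels)
            if label in MODEL_DATASET]

# e.g. models_for_excerpt(set(), "the man cast his net into the river")
#   -> ["models/man.glb", "models/fishing_net.glb", "models/river.glb"]
```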
[0069] Additionally, at step 250, the plurality of 3-Dimensional (3D) models are processed and merged by the processor (1044) to generate a 3D visualization (316) depicting the information on the one or more pages. At this step, the processor (1044) adds organic gestures and motion to the generated plurality of 3D models (314) using the prestored gestures and motion profiles in the data repository. As previously mentioned, the prestored gestures and motion profiles are based on the generated plurality of 3D models (314). Further, the processor (1044) also takes visual cues from the information extracted. Continuing the previous example, the processor (1044) may apply the motions and gestures performed during a fishing activity to the generated plurality of 3D models (314) to generate the 3D visualization (316), or the moving animation. The 3D visualization (316) may have a predetermined frame rate and duration.

[0070] Also, at step 260, the processor (1044) displays the generated 3D visualization (316), having the predetermined frame rate and duration, in the chronological order of the one or more pages, in a mixed reality space (312) of the HMD (102). The same has been illustrated in figure 3B, where a 3D visualization (316) of a man fishing has been illustrated. It is to be noted here that the 3D visualization shown in figure 3B is just an example and there may be more than one 3D visualization (316) for a single page, depending upon the number of excerpts into which the information has been extracted. As previously mentioned, the plurality of excerpts are stored as the machine-interpretable array, so each new 3D visualization continues from the previous 3D visualization (316), like a story, movie or chapter in a continuous flow of information. In the above example, there may have been a 3D visualization before the 3D visualisation (316), showing the same fisherman in his home getting ready for fishing. Further, there may be a 3D visualization that would appear after the 3D visualization (316) displayed in figure 3B, showing the fisherman cooking or selling his fish in the market. Similarly, the 3D visualizations (316) are generated page by page, starting from a first page of the one or more pages in a chronological order, and the 3D visualization (316) of a subsequent page continues from a chronology of information from a previous page. So, when the user (322) turns the page of the document (302), the story or information flow continues from the previous page. This enables the user (322) to understand the document (302) even without knowing the language of the document (302), or even without having to read the document (302).
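A minimal sketch of steps 250 and 260, under the assumption that each 3D model is paired with a prestored motion profile and that page visualizations are handed to the renderer in page order, follows; all type names and durations are illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class Visualization:
    page: int                                   # page number, for chronology
    clips: list = field(default_factory=list)   # (model_path, motion_profile)
    duration_s: float = 0.0                     # predetermined scene duration

def build_visualization(page, model_paths, motion_lookup, seconds_per_clip=5.0):
    """Merge one page's 3D models into a single timed visualization by
    attaching a prestored motion profile to each model (step 250)."""
    viz = Visualization(page=page)
    for path in model_paths:
        viz.clips.append((path, motion_lookup.get(path, "profiles/idle.anim")))
    viz.duration_s = seconds_per_clip * max(len(viz.clips), 1)
    return viz

def play_in_order(visualizations):
    """Yield page visualizations in chronological page order (step 260), so
    the scene for a subsequent page continues the previous page's story."""
    for viz in sorted(visualizations, key=lambda v: v.page):
        yield viz  # each scene is handed to the HMD renderer in turn
```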
[0071] It will be appreciated by a skilled addressee that the above-mentioned steps do not follow any strict order, and also that the steps take place in real time, much faster than the time that would have been required for the user (322) to read the whole document (302). Furthermore, in one embodiment, all the generated 3D visualizations (316) for a particular document (302) are saved in the data repository for future reference. So, in case some other user (322) or the same user (322) wants to read that document (302), the 3D visualizations (316) already generated would be accessed and displayed in the mixed reality space (312). The user (322) may also access the 3D visualizations (316) of documents previously generated and stored in the data repository without requiring the document (302) itself and without having to go through the steps of the method (200) again. In another embodiment, the 3D visualizations (316) may also be shared with other HMDs.
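As an illustrative sketch of this caching behaviour (the keying scheme and in-memory storage are assumptions made for the example), generated visualizations can be saved against a hash of the recognised text, so that any later reader of the same document retrieves them directly:

```python
import hashlib
import json

class VisualizationCache:
    """Illustrative in-memory stand-in for the data repository: saves the
    generated visualizations per document so a later reader skips the
    regeneration steps of the method (200)."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def _key(document_text):
        # A hash of the recognised text identifies the document regardless
        # of which physical or electronic copy is being read.
        return hashlib.sha256(document_text.encode("utf-8")).hexdigest()

    def save(self, document_text, visualizations):
        # visualizations is assumed to be JSON-serializable metadata here.
        self._store[self._key(document_text)] = json.dumps(visualizations)

    def load(self, document_text):
        cached = self._store.get(self._key(document_text))
        return json.loads(cached) if cached else None
```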
[0072] Figures 3C-3H illustrate other exemplary embodiments and implementations of the present invention:
[0073] Figure 3C illustrates a scenario where the user (322) is assumed to be a screenplay writer, director or any other person associated with the movie business, reading the document (302), which is a script of a movie to be made. As shown in figure 3C, the user (322) is able to see the 3D visualization (316) of the pages of the script that he is reading. In the example shown, it is assumed that the user (322) is reading a page about the introduction of a character, wherein the character is mentioned to be a rich man having a big house, who lives a disciplined life and likes to work out. So, using this information as well as other cues from the script, the processor (1044) generates and displays the 3D visualization (316), comprising a plurality of 3D models (314) such as a big house, a man etc., with the character shown jogging around the house in the mixed reality space (312). Now the user (322) finds that there should be a luxury car around the house to properly reflect the personality and lifestyle of the character. So, the user (322) may add a new 3D model of a luxury car from the data repository.
[0074] In case the user (322) does not like the stored plurality of 3D models of the luxury car, he may simply draw a 2D sketch on a paper or a 3D sketch in the air using the handheld electronic controller (106) or hand gestures (bare hands) in the mixed reality space (312). Figure 3D illustrates the user (322) making the car in the mixed reality space (312). As shown in figure 3D, multiple tool palettes (344) are available for the user (322) to utilise. These air sketches and models can easily be made by any common person, without requiring any expertise in sketching/drawing. This 3D sketch may then be converted to a new 3D model (342) by following the steps 220-250. The generated 3D model (342) may then be added to the previously generated 3D visualization (316). The same has been illustrated in figure 3E. Additionally, if the user (322) finds the luxury car he needs for the 3D visualization (316) in some electronic or physical paper/book, the user (322) may easily use the handheld electronic controller (106) to mark a boundary around the luxury car (or any object for that matter) printed in the book. The processor (1044) detects and identifies the marked area and the object enclosed therein, and creates a 3D model of the same. The generated 3D model may then be added into the 3D visualization (316) (similar to a copy-paste functionality).
[0075] Furthermore, in case the user (322) does not like the features of any of the plurality of 3D models in the 3D visualisation (316), the user (322) may easily select the desired object and edit the selected object in one or more modes available in the HMD (102). The one or more modes are selected from, but not limited to, sketching mode, painting mode, sculpting mode and modelling mode, and each mode comprises a customised tool palette (344) showing the different tools required for the respective mode. The user (322) may use the handheld electronic controller (106) as a tool in the mixed reality space (312) for adding, removing, editing or modelling any 3D model or sketch in any of the above-mentioned modes. The tool is selected from, but not limited to, selection tools, sketching tools such as pencil, pen and paint brush, and hardware tools such as hammer, chisel, screwdriver, plier, axe and wrench. Furthermore, the handheld electronic controller (106) is configured to provide a dynamic haptic feedback and spatial sound effect according to a selected virtual tool, a pressure applied by the user (322) and a material being modelled, while interacting in the mixed reality space (312).
[0076] After the editing and modelling, the user (322) may add the edited 3D object (342) to the 3D visualization (316) again. Referring back to figure 3E, the luxury car may now be seen parked outside the house while the character jogs in the updated 3D visualization (318). In this manner, the present invention may be used for assisting in writing a screenplay for a movie plot/story by providing 3D visualizations (316) of how the scenes can be set up or how the scenes would appear (if the screenplay is being read using the HMD (102)). Additionally, these 3D visualizations (316, 318) may be stored and shown to the actors, producers etc. during narration of the script for better understanding.
[0077] Figure 3F illustrates an exemplary embodiment where it is assumed that the user (322) is reading a manual of a car engine, and the 3D visualization (316) comprising a plurality of 3D models (314) is generated and displayed in the mixed reality space (312). The user (322) may interact with the plurality of 3D models (314) and observe their operations in the mixed reality space using the handheld electronic controller (106) or hand gestures. In this manner, the present invention may help common users as well as engineering students to have a better understanding of their vehicle or field of study.
[0078] Figure 3G illustrates a scenario where the user (322) may be a student studying workshop technology, reading a topic on making wooden T-joints in the document (i.e. an academic book). As shown in figure 3G, the user (322) may not only see the 3D visualizations (316) in the mixed reality space (312) for better understanding, but also get a life-like hands-on experience of modelling the plurality of 3D objects (342). The user (322) is illustrated to be using the "modelling mode" in the HMD (102) and using the handheld electronic controller (106) as a chisel. Additionally, the handheld electronic controller (106) is further configured to provide a dynamic haptic feedback and spatial sound effect according to a selected virtual tool, a pressure applied by the user (322) and a material being modelled, while interacting in the mixed reality space (312). In the example shown, the user (322) would actually feel the vibration while chiselling the wood, or the impact while using the handheld electronic controller (106) as a hammer. The audio unit of the HMD (102) also provides the spatial sound effect of the hammer hitting the wood to provide a more realistic feel and an immersive experience. In one embodiment, the processor (1044) and the HMD (102) automatically detect the tool to be used by the user (322) based on the manner in which it is held. In another embodiment, the processor (1044) displays the tools required by the user (322) in the selected mode on a virtual belt worn by the user (322), just like the manner in which a professional works. The user (322) may reach the belt in the mixed reality space (312) to access the required tool.
[0079] Figure 3H illustrates an example where the user (322) is assumed to be a child reading his/her sketch/colouring book. In this scenario, the processor (1044) generates the 3D visualisation of the pages of the book & also enables the user (322) to do air sketching and colouring of the plurality of 3D models (342) in the mixed reality space (312). As can be seen from figure 3H, the 3D visualization (316) of an underwater sea is shown and the user (322) is air sketching using the hand gestures (3222) in the 3D mixed reality space (312). Further, the 3D sketching with hand gestures (3222) and the handheld electronic controller (106) may also be accompanied with voice inputs from the user (322). The user (322) may describe what he/she wants to add to the 3D sketch and, using speech recognition and interpretation capabilities, the processor (1044) may add those details and make relevant suggestions for the user (322) to accept and add to the 3D sketch. In the present example, the user (322) may easily speak about adding particular colours to the fish's body parts while sketching the fish. After this, the 3D sketch may then be converted to a 3D model and accordingly to the 3D visualization (316). In one embodiment of the present invention, the user (322) may easily access the one or more modes in the HMD (102) for sketching, sculpting, modelling etc. at all times, without the requirement of a document (302).
[0080] According to another aspect of the invention, there is provided a system (400) for generating a 3D visualization (316) using information from a document (302) in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD (102)). The system comprises a control module (402) connected with the HMD (102), and an interface module (404). The interface module (404) is configured to receive one or more images of one or more pages of a document (302) in a chronological order, one by one using an image acquisition device of the HMD (102). Further, the control module (402) is configured to process the one or more images for text recognition and extracting information from the recognised text in a form of a plurality of excerpts; identify a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset; and convert the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models (314) using a second prestored dataset. Additionally, the control module (402) is further configured to process and merge the plurality of 3-Dimensional (3D) models to generate a 3D visualization (316) depicting the information on the one or more pages. Furthermore, the interface module (404) is further configured to display the generated 3D visualization (316) having a predetermined duration in the chronological order of the one or more pages, in a mixed reality space of the HMD (102). Moreover, the 3D visualizations (316) are generated page by page, starting from a first page of the one or more pages in a chronological order, and the 3D visualization (316) of a subsequent page continues from a chronology of information from a previous page. This enables the user (322) to understand the document (302) even without knowing the language of the document (302).
[0081] In accordance with an embodiment of the present invention, the system (400) further comprises the handheld electronic controller (106) connected with the HMD (102) via the communication network (108). In one embodiment, the handheld electronic controller may also be connected via the one or more ports of the HMD (102). The handheld electronic controller (106) is configured to sketch & add new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user (322) in the mixed reality space (312); and edit & model with the previously generated plurality of 2D sketches and/or 3D sketches and/or the plurality of 3D models (314) in the mixed reality space (312).
[0082] In accordance with an embodiment of the present invention, the document (302) is selected from a group comprising, but not limited to, e-books, printed text books, hand written texts, info-graphic books, user manuals, novels, e-files, manuscripts or inscriptions.
[0083] In accordance with an embodiment of the present invention, the control module (402) is configured to extract information from the recognised text by extracting terms indicative of a corresponding plurality of objects and segmenting the recognised text into the plurality of excerpts based on the context in which the terms are being used.
[0084] In accordance with an embodiment of the present invention, the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
[0085] In accordance with an embodiment of the present invention, the plurality of objects include all the living and non-living objects selected from a group comprising, but not limited to, humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
[0086] In accordance with an embodiment of the present invention, the second prestored dataset includes a plurality of 3D models (314) corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
[0087] In accordance with an embodiment of the present invention, the control module (402) is configured to process and merge the plurality of 3-Dimensional (3D) models by adding organic gestures and motion to the generated plurality of 3D models (314), using prestored gestures and motion profiles based on the generated plurality of 3D models (314) and the information extracted from the page.
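A minimal sketch of how such prestored motion profiles might be attached to the generated models, assuming a profile table keyed by (object, action) pairs (the keys and clip names are invented for illustration):

```python
# Hypothetical sketch of paragraph [0087]: pairing generated 3D models with
# prestored motion profiles based on information extracted from the page.
MOTION_PROFILES = {
    ("horse", "run"):  "animations/horse_gallop.anim",
    ("horse", "eat"):  "animations/horse_graze.anim",
    ("human", "walk"): "animations/human_walk.anim",
}

def animate(models, excerpt_verbs):
    """Pair each model with a motion profile keyed by (object, verb); models
    with no matching profile remain static in the merged visualization."""
    animated = []
    for model in models:
        clip = next((MOTION_PROFILES[(model, v)] for v in excerpt_verbs
                     if (model, v) in MOTION_PROFILES), None)
        animated.append((model, clip))
    return animated

# Example: animate(["horse"], ["eat"])
# -> [("horse", "animations/horse_graze.anim")]
```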
[0088] The present invention offers a number of advantages. Firstly, the present invention provides a simple, cost-effective and easy-to-use solution to the problems of the prior art. Further, the present invention eliminates barriers such as language difference, cultural difference and even illiteracy from the transfer of knowledge and information. Even a person who does not know how to read and write can now understand written information in the form of 3D visualisations. Similarly, information in a foreign language can be understood in the same way. Furthermore, the present invention enables people of all age groups to express their own thoughts and ideas, as well as to understand others' thought processes, using the 3D visualisations generated in the mixed reality space. People can now express their ideas using 3D sketching in the mixed reality space, which may be converted into 3D models and visualisations for effective transfer of information.
[0089] Similarly, people, especially students and professionals who find it difficult or boring to study their books, can now easily turn them into interactive 3D visualisations and enjoy the experience of studying. Additionally, the present invention finds numerous applications in the digital content and movie industry (by assisting in screenplay, narration etc.) and also for students in schools and colleges, industry professionals, artists, readers etc. The present invention provides an immersive, interactive and hands-on experience of multiple tasks while providing the solution to the problems of the prior art.

[0090] Further, one would appreciate that the communication network used in the system can be a short-range and/or a long-range communication network, wired or wireless. The communication interface includes, but is not limited to, a serial communication interface, a parallel communication interface or a combination thereof.
[0091] In general, the word "module," as used herein, refers to logic embodied in hardware or firmware, or to a collection of software instructions written in a programming language such as, for example, Java, C, Python or assembly. One or more software instructions in the modules may be embedded in firmware, such as an EPROM. It will be appreciated that modules may comprise connected logic units, such as gates and flip-flops, and may comprise programmable units, such as programmable gate arrays or processors. The modules described herein may be implemented as software and/or hardware modules and may be stored in any type of computer-readable medium or other computer storage device.
[0092] Further, while one or more operations have been described as being performed by or otherwise related to certain modules, devices or entities, the operations may be performed by or otherwise related to any module, device or entity. As such, any function or operation that has been described as being performed by a module could alternatively be performed by a different server, by the cloud computing platform, or a combination thereof. It should be understood that the techniques of the present disclosure might be implemented using a variety of technologies. For example, the methods described herein may be implemented by a series of computer-executable instructions residing on a suitable computer-readable medium. Suitable computer-readable media may include volatile (e.g. RAM) and/or non-volatile (e.g. ROM, disk) memory, carrier waves and transmission media. Exemplary carrier waves may take the form of electrical, electromagnetic or optical signals conveying digital data streams along a local network or a publicly accessible network such as the Internet.
[0093] It should also be understood that, unless specifically stated otherwise as apparent from the discussion, terms such as "controlling" or "obtaining" or "computing" or "storing" or "receiving" or "determining" or the like, used throughout the description, refer to the action and processes of a computer system, or similar electronic computing device, that processes and transforms data represented as physical (electronic) quantities within the computer system's registers and memories into other data similarly represented as physical quantities within the computer system's memories or registers or other such information storage, transmission or display devices.
[0094] Various modifications to these embodiments are apparent to those skilled in the art from the description and the accompanying drawings. The principles associated with the various embodiments described herein may be applied to other embodiments. Therefore, the description is not intended to be limited to the embodiments shown along with the accompanying drawings, but is to be accorded the broadest scope consistent with the principles and the novel and inventive features disclosed or suggested herein. Accordingly, the invention is intended to embrace all such alternatives, modifications and variations that fall within the scope of the present invention and the appended claims.

Claims
1. A method (200) for generating a 3D visualization (316) using information from a document (302) in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD (102)), the method (200) comprising the steps of:
receiving one or more images of one or more pages of a document (302) in a chronological order, one by one using an image acquisition device of the HMD (102);
processing the one or more images for text recognition and extracting information from the recognised text in the form of a plurality of excerpts;
identifying a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset;
converting the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models (314) using a second prestored dataset;

processing and merging the plurality of 3-Dimensional (3D) models to generate a 3D visualization (316) depicting the information on the one or more pages; and
displaying the generated 3D visualization (316) having a predetermined duration in the chronological order of one or more pages, in a mixed reality space (312) of the HMD (102), thereby enabling a user (322) to understand the document (302) even without knowing the language of the document (302);
wherein the 3D visualizations (316) are generated page by page, starting from a first page of the one or more pages in a chronological order, and the 3D visualization (316) of a subsequent page continues the chronology of information from the previous page.
2. The method (200) as claimed in claim 1, further comprising:
sketching & adding a new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user (322) in the mixed reality space (312), using a handheld electronic controller (106) connected with the HMD (102); and

editing & modelling with the previously generated plurality of 2D sketches and/or 3D sketches and/or the plurality of 3D models (314) in the mixed reality space (312), using the handheld electronic controller (106).
3. The method (200) as claimed in claim 1, wherein the document (302) is selected from a group comprising e-books, printed textbooks, handwritten texts, info-graphic books, user manuals, novels, e-files, manuscripts or inscriptions.
4. The method (200) as claimed in claim 1, wherein extracting information from the recognised text includes extracting terms indicative of a corresponding plurality of objects and segmenting the recognised text into the plurality of excerpts based on the context in which the terms are being used.
5. The method (200) as claimed in claim 4, wherein the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
6. The method (200) as claimed in claim 5, wherein the plurality of objects include all the living and non-living objects selected from a group comprising humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
7. The method (200) as claimed in claim 6, wherein the second prestored dataset includes a plurality of 3D models (314) corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
8. The method (200) as claimed in claim 1, wherein processing and merging the plurality of 3-Dimensional (3D) models to generate a 3D visualization (316) comprises a step of adding organic gestures and motion to the generated plurality of 3D models (314) using prestored gestures and motion profiles based on the generated plurality of 3D models (314) and the information extracted from the page.
9. A computer system (104) for generating a 3D visualization (316) using information from a document (302) in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD (102)), the computer system (104) comprising:
a memory unit (1042) configured to store machine-readable instructions; and
a processor (1044) operably connected with the memory unit (1042), the processor (1044) obtaining the machine-readable instructions from the memory unit (1042), and being configured by the machine-readable instructions to:
receive one or more images of one or more pages of a document (302) in a chronological order, one by one using an image acquisition device of the HMD (102);
process the one or more images for text recognition and extract information from the recognised text in the form of a plurality of excerpts;

identify a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset;
convert the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models (314) using a second prestored dataset;
process and merge the plurality of 3-Dimensional (3D) models to generate a 3D visualization (316) depicting the information on the one or more pages; and
display the generated 3D visualization (316) having a predetermined duration, in the chronological order of the one or more pages, in a mixed reality space of the HMD (102), thereby enabling a user (322) to understand the document (302) even without knowing the language of the document (302);
wherein the 3D visualizations (316) are generated page by page, starting from a first page of the one or more pages in a chronological order, and the 3D visualization (316) of a subsequent page continues the chronology of information from the previous page.
10. The computer system (104) as claimed in claim 9, further comprising a handheld electronic controller (106) connected with the HMD (102); wherein the handheld electronic controller (106) is configured to: sketch & add a new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user (322), in the mixed reality space (312); and edit & model with the previously generated plurality of 2D sketches and/or 3D sketches and/or the plurality of 3D models (314) in the mixed reality space (312).
11. The computer system (104) as claimed in claim 9, wherein the document (302) is selected from a group comprising e-books, printed textbooks, handwritten texts, info-graphic books, user manuals, novels, e-files, manuscripts or inscriptions.
12. The computer system (104) as claimed in claim 9, wherein the processor (1044) is configured to extract information from the recognised text by extracting terms indicative of a corresponding plurality of objects and segmenting the recognised text into the plurality of excerpts based on the context in which the terms are being used.
13. The computer system (104) as claimed in claim 12, wherein the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
14. The computer system (104) as claimed in claim 13, wherein the plurality of objects include all the living and non-living objects selected from a group comprising humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
15. The computer system (104) as claimed in claim 14, wherein the second prestored dataset includes a plurality of 3D models (314) corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
16. The computer system (104) as claimed in claim 9, wherein the processor (1044) is configured to process and merge the plurality of 3-Dimensional (3D) models by adding organic gestures and motion to the generated plurality of 3D models (314) using prestored gestures and motion profiles based on the generated plurality of 3D models (314) and the information extracted from the page.
17. A system (400) for generating a 3D visualization (316) using information from a document (302) in a chronological order of pages, using a Mixed Reality (MR) based Head Mounted Device (HMD (102)), the system (400) comprising:
a control module (402) connected with the HMD (102); and
an interface module (404);
wherein the interface module (404) is configured to receive one or more images of one or more pages of a document (302) in a chronological order, one by one using an image acquisition device of the HMD (102);
wherein the control module (402) is configured to:
process the one or more images for text recognition and extract information from the recognised text in the form of a plurality of excerpts;

identify a plurality of 2-Dimensional (2D) sketches for the corresponding plurality of excerpts in the chronological order of the one or more pages using a first prestored dataset;
convert the plurality of 2D sketches and the extracted information into a corresponding plurality of 3D models (314) using a second prestored dataset;
process and merge the plurality of 3-Dimensional (3D) models to generate a 3D visualization (316) depicting the information on the one or more pages;
wherein the interface module (404) is further configured to display the generated 3D visualization (316) having a predetermined duration, in the chronological order of the one or more pages, in a mixed reality space of the HMD (102), thereby enabling a user (322) to understand the document (302) even without knowing the language of the document (302); and

wherein the 3D visualizations (316) are generated page by page, starting from a first page of the one or more pages in a chronological order, and the 3D visualization (316) of a subsequent page continues the chronology of information from the previous page.
18. The system (400) as claimed in claim 17, further comprising a handheld electronic controller (106) connected with the HMD (102);
wherein the handheld electronic controller (106) is configured to: sketch & add a new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user (322) in the mixed reality space (312); and edit & model with the previously generated plurality of 2D sketches and/or 3D sketches and/or the plurality of 3D models (314) in the mixed reality space (312).
19. The system (400) as claimed in claim 17, wherein the document (302) is selected from a group comprising e-books, printed textbooks, handwritten texts, info-graphic books, user manuals, novels, e-files, manuscripts or inscriptions.
20. The system (400) as claimed in claim 17, wherein the control module (402) is configured to extract information from the recognised text by extracting terms indicative of a corresponding plurality of objects and segmenting the recognised text into the plurality of excerpts based on the context in which the terms are being used.
21. The system (400) as claimed in claim 20, wherein the first prestored dataset includes a plurality of 2D sketches corresponding to a respective plurality of objects.
22. The system (400) as claimed in claim 21, wherein the plurality of objects include all the living and non-living objects selected from a group comprising humans of multiple age groups, animals, plants, furniture, vehicles, natural resources, eatables, crops, infrastructure, stationery, sign boards, wearables, musical instruments, sports equipment, mechanical tools, electrical equipment and electronic equipment.
23. The system (400) as claimed in claim 22, wherein the second prestored dataset includes a plurality of 3D models (314) corresponding to a respective plurality of 2D sketches, the plurality of objects and the 3D sketches.
24. The system (400) as claimed in claim 17, wherein the control module (402) is configured to process and merge the plurality of 3-Dimensional (3D) models by adding organic gestures and motion to the generated plurality of 3D models (314) using prestored gestures and motion profiles based on the generated plurality of 3D models (314) and the information extracted from the page.
25. A handheld electronic controller (106) connected with a Mixed Reality (MR) based Head Mounted Device (HMD (102)), comprising at least a magnetic coil, a tactile sensor, a haptic feedback device, a biometric sensor, a microprocessor and a power source;
wherein the handheld electronic controller (106) is configured to: sketch & add a new plurality of 2D sketches and/or 3D sketches based on gestures and/or voice inputs from the user (322) in the mixed reality space (312); and edit & model with the previously generated plurality of 2D sketches and/or the 3D sketches and/or the plurality of 3D models (314) in the mixed reality space (312).
26. The handheld electronic controller (106) as claimed in claim 25, wherein the handheld electronic controller (106) is configured to be used as a tool in one or more modes based on manual selection and/or a manner of holding the handheld electronic controller (106);
wherein the tool is selected from selection tools, sketching tools such as pencil, pen and paint brush, and hardware tools such as hammer, chisel, screwdriver, plier, axe and wrench;
wherein the handheld electronic controller (106) is further configured to provide dynamic haptic feedback, using the haptic feedback device, and a spatial sound effect according to a selected virtual tool, a pressure applied by the user (322) and a material being modelled, while interacting in the mixed reality space (312); wherein the tactile sensor senses the force exerted by the user to simulate click and drag actions; wherein the one or more modes are selected from a sketching mode, a painting mode, a sculpting mode and a modelling mode, each mode comprising a customised tool palette (344) showing the different tools required for the respective mode; and wherein the biometric sensor is configured to identify and save profiles for multiple users so as to change the one or more modes automatically as preferred by the user.
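Purely as a non-limiting illustration of the dynamic haptic feedback of claim 26, the feedback amplitude can be modelled as a function of the selected virtual tool, the applied pressure and the material being modelled; the coefficients below are invented for the example and are not disclosed values:

```python
# Illustrative model of the dynamic haptic feedback of claim 26.
# TOOL_GAIN and MATERIAL_RESISTANCE are hypothetical tuning tables.
TOOL_GAIN = {"pencil": 0.2, "paint_brush": 0.1, "hammer": 1.0, "chisel": 0.8}
MATERIAL_RESISTANCE = {"clay": 0.3, "wood": 0.6, "stone": 1.0}

def haptic_amplitude(tool: str, pressure: float, material: str) -> float:
    """Return a normalised vibration amplitude in [0, 1] for the controller's
    haptic feedback device, scaled by tool, user pressure, and material."""
    gain = TOOL_GAIN.get(tool, 0.5) * MATERIAL_RESISTANCE.get(material, 0.5)
    return max(0.0, min(1.0, gain * pressure))

# Example: chiselling stone at 80 % pressure gives a strong pulse.
# haptic_amplitude("chisel", 0.8, "stone") -> 0.64
```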
27. The handheld electronic controller (106) as claimed in claim 25, wherein sketching & adding a new plurality of 2D sketches and/or 3D sketches further includes:
copying 2D or 3D images from visible real-world sources, such as a physical magazine or an electronic file, by creating a boundary to enclose a required 2D or 3D image using the handheld electronic controller (106); and
pasting the selected 2D or 3D image in the mixed reality space (312) of the HMD (102) and converting the 2D or 3D image into a 3D model.
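A minimal, hypothetical sketch of the copy-and-paste flow of claim 27, assuming the camera frame is a 2D pixel grid and the user-drawn boundary is a list of points; both helpers are illustrative stand-ins, not the disclosed mechanism:

```python
# Hypothetical sketch of claim 27: copy a 2D image by a user-drawn boundary,
# paste it into the mixed reality space, and request 2D-to-3D conversion.
from typing import List, Tuple

Point = Tuple[int, int]  # (x, y) in frame coordinates

def crop_by_boundary(frame: List[List[int]], boundary: List[Point]) -> List[List[int]]:
    """Crop the camera frame to the axis-aligned bounding box of the boundary
    drawn with the controller (a simplification of free-form cutting)."""
    xs = [p[0] for p in boundary]
    ys = [p[1] for p in boundary]
    return [row[min(xs):max(xs) + 1] for row in frame[min(ys):max(ys) + 1]]

def paste_and_convert(cropped: List[List[int]]) -> dict:
    """Place the copied image into the MR scene and mark it for 2D-to-3D
    conversion (e.g. via the second prestored dataset or a mesh estimator)."""
    return {"type": "3d_model_request", "source": cropped, "space": "mixed_reality"}
```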
PCT/IB2020/055370 2019-06-08 2020-06-08 A system and a method for generating a 3d visualization using mixed reality-based hmd WO2020250110A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
IN201921000755 2019-06-08
IN201921000755 2019-06-08

Publications (1)

Publication Number Publication Date
WO2020250110A1 (en) 2020-12-17

Family

ID=73782062

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2020/055370 WO2020250110A1 (en) 2019-06-08 2020-06-08 A system and a method for generating a 3d visualization using mixed reality-based hmd

Country Status (1)

Country Link
WO (1) WO2020250110A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11867904B2 (en) 2020-12-18 2024-01-09 Samsung Electronics Co., Ltd. Method and electronic device for providing augmented reality environment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120078628A1 (en) * 2010-09-28 2012-03-29 Ghulman Mahmoud M Head-mounted text display system and method for the hearing impaired
US20140344762A1 (en) * 2013-05-14 2014-11-20 Qualcomm Incorporated Augmented reality (ar) capture & play
US20160088282A1 (en) * 2014-09-22 2016-03-24 Samsung Electronics Company, Ltd. Transmission of three-dimensional video


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20822604

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20822604

Country of ref document: EP

Kind code of ref document: A1
