CN116457774A - Computer-implemented method for automatically analyzing or using data - Google Patents


Info

Publication number
CN116457774A
Authority
CN
China
Prior art keywords: machine, paragraph, representation, semantic, language
Legal status: Pending
Application number
CN202180072454.8A
Other languages
Chinese (zh)
Inventor
威廉·通斯托尔-佩多 (William Tunstall-Pedoe)
芬莱·柯伦
哈利·罗斯科
罗伯特·海伍德
Current Assignee
Aolaikelei Artificial Intelligence Co ltd
Original Assignee
Aolaikelei Artificial Intelligence Co ltd
Application filed by Aolaikelei Artificial Intelligence Co Ltd
Priority claimed from PCT/GB2021/052196 (WO2022043675A2)
Publication of CN116457774A


Abstract

A computer-implemented method for automatically analyzing or using data, which may be implemented by a voice assistant. The method comprises the following steps: (a) storing in memory a structured machine-readable representation ("machine representation") of data conforming to a machine-readable language, the machine representation including a representation of user speech or text input for a human-machine interface; and (b) automatically processing the machine representation to analyze the user speech or text input.

Description

Computer-implemented method for automatically analyzing or using data
Technical Field
The field of the invention relates to a computer-implemented method for automatically analyzing or using data; one embodiment is a voice assistant that is capable of analyzing, interpreting, and acting on natural language spoken and text input.
Background
Natural Language (NL) is language that evolved for use by humans, such as English. Despite significant advances in the ability of computers to process natural language, computers still have no way of deeply understanding the meaning of natural language and using it internally.
For this reason, most computer applications use structured data to store the information they need to process, for example a relational database: a schema is designed, the database is populated, and code is written to process the fields in the database. The use of structured data works well when the application has limited requirements for the types of data it needs. However, some applications naturally require an extremely broad, heterogeneous set of data to work well, which means that the required schema would have to be enormous, making it impractical to build and code such applications. We refer to such applications herein as HUB applications (heterogeneous and unreasonably broad).
Examples of HUB applications include applications for managing general health data of a person, where thousands of tests, thousands of medical conditions, and thousands of symptoms exist. Another related application may be a nutrition tracking application where there are thousands of substances and foods that can be ingested, each with a different metabolic impact on the body.
Another example is an application that matches a potential candidate's resume with a job specification: in principle, such applications would require structured data to represent each skill that may be valuable to any role, each type of experience, each type of previous work.
Accounting is another application where large amounts of heterogeneous data would be valuable: perfect accounting applications would represent each type of contract, each type of service.
In practice, such applications, where they exist, operate with limited schemas that do not cover the full range of their ideal functionality. For example, health applications typically work like this, ignoring the many types of data they do not cover, or end up narrow, limiting the application to only certain verticals within health.
Applications may also use natural language, or extend a limited schema with natural language; for example, current resume-matching applications may represent some key skills in structured form, but otherwise rely heavily on keyword searches of written resumes or on statistical Natural Language Processing (NLP) techniques.
In the case of accounting, transactions are represented with limited structured data: debits and credits on nominal ledgers with natural-language names. The meaning of the natural-language names, and thus the content of these transaction representations, is typically opaque to the application. Nominal ledgers typically lump different types of transaction together, failing to represent semantic differences that may be important.
There is no exact threshold at which an application becomes a HUB application, but the difficulty of building an application with a manually created schema grows more than linearly with the number of tables, as it becomes harder and harder to manage the tables and maintain the code around them.
These problems could be solved if there were a language or representation that a computer could fully process and understand, but that could also represent an extremely broad range of data.
In conventional Artificial Intelligence (AI), statistical Machine Learning (ML), and in particular Deep Learning (DL), is widely used. This has delivered significant advances on many problems. Despite these advances, the results cannot be explained to a human user in a meaningful way, because the solution is the result of a computation that may involve billions of weights. It can also be argued that such systems lack a "real" understanding of the data, or at least that any such understanding is very different from the way a human user would understand it.
Discussion of the Related Art
The Wikipedia page on Cyc, as of 18 July 2019, states that Cyc is the world's longest-lived artificial intelligence project, attempting to assemble a comprehensive ontology and knowledge base spanning the basic concepts and "rules of thumb" about how the world works (think common-sense knowledge, but with a focus on things that are rarely written down or said, in contrast to facts one might find somewhere on the internet or via a search engine or Wikipedia), with the aim of enabling AI applications to perform human-like reasoning and to be less "brittle" when confronted with new situations that were not anticipated.
The same Wikipedia page states that the Cyc project aims to codify, in machine-usable form, the millions of pieces of knowledge that make up human common sense; this required, along the way, (1) developing a fully expressive representation language, CycL, (2) developing an ontology spanning all human concepts down to some appropriate level of detail, (3) developing a knowledge base on that ontological framework, comprising all human knowledge about those concepts down to some appropriate level of detail, and (4) developing an inference engine much faster than those used in then-current conventional expert systems, in order to be able to infer conclusions of the same types and depth as humans can, given their understanding of the world.
The page also states that much of the knowledge in Cyc, outside mathematics and games, is only true by default; for example, Cyc knows that, by default, parents love their children, you smile when you are happy, taking your first step is a big accomplishment, when someone you love has a big accomplishment that makes you happy, and only adults have children; when asked whether a picture captioned "someone watching his daughter take her first step" contains a smiling adult, Cyc can logically infer that the answer is yes, and "show its work" by presenting the step-by-step logical argument using these five pieces of knowledge from its knowledge base; these are expressed in CycL, which is based on predicate calculus and has a syntax similar to that of the Lisp programming language.
The page further notes that the Cyc project has been described as "one of the most controversial endeavours in the history of artificial intelligence". According to Catherine Havasi, CEO of Luminoso, Cyc is a precursor project to IBM's Watson. Machine-learning scientist Pedro Domingos refers to the project as a "catastrophic failure", for several reasons, including the endless amounts of data required to produce any viable results and the inability of Cyc to evolve on its own. George Mason University economics professor Robin Hanson gives a more balanced analysis: "Of course the CYC project is open to criticism on its many particular choices. People have complained about its logic-like and language-like representations, about its selection of prototypical cases to build from (e.g., encyclopedia articles), about its focus on answering rather than acting, about how often it rebuilds versus maintains its legacy systems, and about remaining private versus publishing everything. But any large project like this will create such disputes, and it is not obvious that any of its choices were serious errors. They had to start somewhere, and to me they have now collected a very striking knowledge base in terms of scale, scope and degree of integration. Other architectures may work better, but if knowing lots of things is as important as Lenat thinks, I would want a serious AI effort to carefully try to import CYC's knowledge and translate it into a new representation. No other source comes close to CYC's scale, scope and degree of integration."
The True Knowledge system provides open-domain question answering using structured knowledge and inference. In the True Knowledge system, knowledge in the knowledge base is represented in a single unified format: named relationships between pairs of named entities, called "facts". Facts and relationships are themselves first-class entities, so facts about facts and facts about the properties of relationships are fully supported (Tunstall-Pedoe, W. (2010). True Knowledge: Open-Domain Question Answering Using Structured Knowledge and Inference. AI Magazine, 31(3), 80-92. https://doi.org/10.1609/aimag.v31i3.2298).
Disclosure of Invention
According to a first aspect of the invention, a computer-implemented method for automatically analyzing or using data comprises the steps of:
(a) storing in memory a structured machine-readable representation ("machine representation") of data conforming to a machine-readable language, the machine representation including a representation of user speech or text input for a human-machine interface; and
(b) automatically processing the machine representation to analyze the user speech or text input.
In a second aspect, a computer-based system configured to analyze data is provided, the system being configured to:
(a) store in memory a structured machine-readable representation of data conforming to a machine-readable language, the structured machine-readable representation of data including a representation of user speech or text input for a human-machine interface; and
(b) automatically process the structured representation to analyze the user speech or text input for the human-machine interface.
These aspects of the invention may be implemented in a voice assistant or chat robot. The technical advantage achieved is the ability to scale voice assistants or chat robots more broadly and more quickly; the invention enables a voice assistant or chat robot to answer a wider range of questions and to answer them more accurately; and the invention makes it easier for a voice assistant or chat robot to work with a large number of different natural languages.
A point of interpretation: the conjunction "or" should not be construed narrowly as indicating mutual exclusivity, but rather inclusively. Thus, the phrase "user speech or text input" means "user speech alone, or user text alone, or both user speech and user text". Where the conjunction "or" is intended to indicate exclusivity, the phrase "either ... or" is used.
The machine-readable language is very expressive, but also very simple; the simplicity requires less computer processing and thus delivers faster performance. Further details of the invention are set out in the appended claims.
According to another aspect of the invention, a computer system is provided comprising a processor and a memory, the processor being configured to answer a question, the processor being configured to use a processing language, wherein the semantic nodes are represented in the processing language, the semantic nodes comprising semantic links between the semantic nodes, wherein the semantic links are themselves semantic nodes, wherein each semantic node marks a specific meaning, wherein a combination of the semantic nodes defines a semantic node, wherein expressions in the processing language can be nested, wherein the question is represented in the processing language, wherein the reasoning step is represented in the processing language to represent semantics of the reasoning step, wherein the computing unit is represented in the processing language, wherein the memory is configured to store the representation in the processing language, and wherein the processor is configured to answer the question using the reasoning step, the computing unit and the semantic nodes and to store the answer to the question in the memory.
One advantage is that, since the semantic links between semantic nodes are themselves semantic nodes, semantic links and semantic nodes do not need to be processed in significantly different ways; this simplifies processing, which speeds up response time, which is a technical effect.
One advantage is that, because semantic nodes are used so widely in the processing language, processing of the processing language is faster, which speeds up response time, a technical effect.
The technical effects operate at the architectural level of the computer system; that is, the effects are produced regardless of what data is being processed.
The technical effects cause the computer system to operate in a new way, because the computer finds answers to questions faster than when using prior art methods: semantic nodes are used so widely in the processing language that processing of the processing language is speeded up.
The processing language contributes to the technical character of the invention because it produces a technical effect: a processor processing the processing language finds answers to questions faster than when using prior art methods, because semantic nodes are used so widely in the processing language that processing of the processing language is speeded up.
The computer system may be configured to output answers to the questions.
The computer system may be configured to output answers to the questions to the display device.
The computer system may be one such system: wherein expressions in a processing language can be nested without limitations inherent to the processing language.
The computer system may be one such system: wherein the semantic nodes each comprise a unique identifier.
The computer system may be one such system: wherein the computing unit is a semantic node.
The computer system may be one such system: wherein the problem is represented in the processing language by: a paragraph including a semantic node that identifies the paragraph as a question; a list of semantic nodes representing the zero, one or more unknown entities being asked about; and at least one further paragraph representing the semantics of the problem in terms of the zero, one or more unknown entities.
The computer system may be one such system: wherein the processing language is a general purpose language.
The computer system may be one such system: wherein the processing language is not a natural language.
The computer system may be one such system: wherein the problem relates to searching and analysis of documents or web pages, wherein the semantic nodes comprise representations of at least parts of the documents or web pages stored in the document storage area.
The computer system may be one such system: wherein the problem relates to a location-based search using map data represented as semantic nodes in a processing language.
The computer system may be one such system: wherein the problem relates to searching for defined advertisements or news, wherein the semantic nodes include representations of advertisements, news articles, or other information items.
The computer system may be one such system: wherein the question relates to a request for a summary of a news topic, wherein the semantic node comprises representations of news from multiple sources, e.g. to provide a summary or an aggregation of news.
The computer system may be one such system: wherein the problem relates to a request for compatibility matching between persons, wherein for a plurality of persons the semantic node comprises a representation of personal information defining one or more attributes of the person.
The computer system may be one such system: wherein the problem relates to checking compliance with requirements for preventing abusive or illegal social media posts, wherein the semantic nodes include representations of the social media posts.
The computer system may be one such system: wherein the questions involve analyzing customer reviews, wherein the semantic nodes comprise representations of the customer reviews.
The computer system may be one such system: wherein the problem relates to a user's product request, wherein the semantic node comprises a product description and a representation of the user's product request.
The computer system may be one such system: wherein the problem relates to a job search, wherein the semantic nodes include representations of job descriptions and skills and experiences of job seekers to determine which job seekers match job descriptions or to determine which job descriptions match skills and experiences of job seekers.
The computer system may be one such system: wherein the problem relates to the health of an individual, wherein the semantic nodes comprise health data relating to the individual and health data relating to humans in general.
The computer system may be one such system: wherein the problem relates to nutrition, wherein the semantic nodes comprise nutritional data for food and beverage.
The computer system may be one such system: wherein the problem relates to accounting or finance, wherein the semantic node comprises a representation of finance or accounting information.
The computer system may be one such system: wherein the question is received by a voice assistant or chat robot, wherein the semantic node comprises a representation of user speech input for the human-machine interface and comprises a representation of the human-machine interface itself.
According to a further aspect of the invention, there is provided a computer implemented method using a computer system comprising a processor and a memory, the processor being configured to use a processing language, wherein semantic nodes are represented in the processing language, the semantic nodes comprising semantic links between semantic nodes, wherein the semantic links are themselves semantic nodes, wherein each semantic node marks a specific meaning, wherein a combination of semantic nodes defines a semantic node, wherein expressions in the processing language can be nested, wherein problems are represented in the processing language, wherein inference steps are represented in the processing language to represent semantics of the inference steps, wherein the computing unit is represented in the processing language, wherein the memory is configured to store representations in the processing language, the method comprising the steps of:
(i) the processor answers the question using the reasoning steps, the computing unit and the semantic nodes; and
(ii) the processor stores the answer to the question in the memory.
Advantages include the advantages of the previous aspects of the invention.
The method may be one in which the problem is represented in the processing language by: a paragraph including a semantic node that identifies the paragraph as a question; a list of semantic nodes representing the zero, one or more unknown entities being asked about; and at least one further paragraph representing the semantics of the problem in terms of the zero, one or more unknown entities.
The method may be one in which the unknown items in the question are identified and the paragraphs that constitute the body of the question are selected for further analysis; processing begins with the list of paragraphs from the body of the question and the selected unknown items; the first paragraph in the paragraph list is selected for processing; processing a single paragraph involves three methods: using statically stored processing-language paragraphs, using computation units, and using processing language generated by reasoning:
the first method is to find whether there are any paragraphs in the paragraph store that can be mapped directly onto the paragraph being processed; if a paragraph has exactly the same structure as a paragraph in the paragraph store, with all nodes other than the unknown items matching, then the values to which the unknown items are matched are a valid result;
the second method is to check whether any results can be found by executing a computation unit; it is checked whether the paragraph matches any paragraph in a computation unit description; all non-unknown nodes in the paragraph being processed must match the same node in the corresponding location in the computation description, or be aligned with a computation input unknown; the unknown item being processed must be aligned with the output unknown in the description; the computation unit is then invoked to obtain valid output values for the unknown item of the paragraph being processed;
the third method is to see whether the paragraph can be proved by applying any reasoning steps; the reasoning steps are searched to find ones whose second half (the consequence paragraph) can be unified with the paragraph being processed; all nodes and structure must be equal between the two paragraphs, other than the unknown items in the focus paragraph or the reasoning paragraph; if such a reasoning paragraph is found, it means that the reasoning step may be able to prove the paragraph being processed; when matching a reasoning paragraph, a multi-stage process is used: first, any mapping of the unknown items in the paragraph being processed is found; second, a mapping of the unknown items used in the reasoning paragraph is found via the mapping with the paragraph being processed; this mapping can then be applied to the first half of the reasoning paragraph to generate a list of paragraphs which, if matched against known or generated processing language with mappings found for them, prove the focus paragraph and yield a valid mapping for it; solutions for this paragraph list can then be found recursively.
The method may use the computer system of any of the previous aspects of the invention.
Aspects of the invention may be combined.
Drawings
Aspects of the invention will now be described, by way of example, with reference to the following figures, in which:
FIG. 1 illustrates an example screen output for notification of job matching.
FIG. 2 illustrates an example screen output for a description of job matching.
Fig. 3 shows an example conversation within an app, in which nutritional data is being communicated to the app.
Fig. 4 shows some example insights that may be derived from levels of health and nutrition data shown over a period of time.
Fig. 5 shows an example graph of daily calorie intake versus calories expended, something that it is very common for people to track if the user is aiming to lose (or gain) weight.
FIG. 6 illustrates an example of a visualization that can be generated by an example of the present invention: it compares the estimated caffeine in the user's body while they are asleep in bed with a calculated measure of sleep quality.
Fig. 7 shows examples of generated explanations: (a) shows an example of a simplified explanation; (b) shows an example of a detailed explanation.
Fig. 8 shows an example of a voice assistant product, referred to herein as "Brian", and how it fits within a broader UL platform alongside other applications built on the UL platform.
Fig. 9 shows an alternative to the example of fig. 8.
Detailed Description
Examples of the present invention include systems and methods for creating and using structured representations of data that are intended to be as broad in what they can express as natural language, but that can also be processed and understood by automated systems. The representation referred to herein as UL (universal language) is a preferred example. Examples of the invention include systems and methods relating to specific HUB and other applications, as well as systems and methods for processing, storing and utilizing UL.
In addition to enabling HUB applications, the use of UL brings other advantages. For example, UL can be used as a way for an automated system to understand the world, and as a medium in which an automated system can reason. Since the reasoning steps are recorded in the language, such a system can also fully explain itself to human users in the language. The requirements for a software system may be written in UL (referred to herein as "principles"), and examples of the invention may use these principles directly to decide their own actions, rather than relying on a human programmer trying to predict all possible scenarios and coding them ahead of time so that the detailed behaviour is determined by program code.
Language representation: UL
One goal of UL is to be able to represent, in principle, anything that can be expressed in any natural language. Another goal is that anything expressed in natural language can thereby be translated into UL. UL is intended to be usable by machines, so this data representation must be fully processable and understandable by automated systems. While these are the goals, examples of the present invention may not fully achieve them, while still having significant advantages over the prior art.
There are many characteristics of natural language that make it extremely difficult for a computer to understand and process. These include ambiguity: words in natural language typically have many senses or meanings, some related and overlapping and some quite different, and people need context, years of experience with natural language, and common-sense knowledge to understand which meaning is intended. For example, the English word "pen" may mean a writing implement, an enclosure for farm animals, a female swan or a prison (an abbreviation of "penitentiary"), among other meanings. As a verb, it also has several related and distinct meanings. Despite this complexity, humans naturally infer the intended meaning of words from context, real-world experience and common-sense knowledge as they encounter them, but this is extremely difficult for machines, which lack those capabilities.
Ambiguity and flexibility in word ordering is another problem that makes natural language extremely difficult for machines to process. Even a simple sentence can be expressed in literally dozens of ways that convey the same meaning. Humans understand these many formulations naturally, but this is not easily captured in an algorithm. Clauses and words within a sentence modify and attach to other parts of the sentence, changing their meaning in ways that humans understand naturally but that do not follow explicit rules a machine could apply. Natural language also uses techniques such as anaphora to refer back to entities mentioned earlier and avoid naming them again (e.g., using pronouns and words such as "he", "she", "it" and "the" in English). Humans naturally understand what is being referred to, but this is far less obvious to a machine. Ambiguity can even arise from where the boundaries of a compound noun lie, e.g. the compound noun "fruit flies" in the sentence "Fruit flies like a banana" as compared with the sentence "Time flies like an arrow".
UL is designed to overcome all these problems and create a language that can be parsed, understood, and processed by machines, and thus can store an extremely broad range of information in a manner that can be understood, processed, and interpreted by machines.
Semantic node
One key component of UL is what we refer to herein as semantic nodes. Semantic nodes are intended to have the broadest conceivable definition of anything that can be defined: anything there is a word for, as well as things without a natural-language name, can have a semantic node in UL.
In various examples, the semantic nodes may include each specific human, the concept of a human (any specific human being is a member thereof), each file, each web page, each audio recording or video, specific relationships (including relationships linking any specific human to the concept of a human), attributes, specific types of language nuances, and each row and item in the relational database table.
Once defined, a semantic node has an identifier and can thus be referenced in UL. In the preferred example, the ID is a 128-bit version 4 UUID (RFC 4122), with the hyphenated lower-case syntax. For example: b1c1cb5f-248f-4871-a73f-900d29066948. The preferred example also allows a Unicode string in double quotes to be its own ID: the string itself is the ID of the semantic node for that particular string. For example, "sangria" is a valid semantic node representing only that string of characters, and not the concept of the drink. In other examples, a UUID or other identifier may be used for the string to further simplify the syntax, but additional language is then required to express the link between the identifier and the string it represents.
In an example, one simple grammar of UL is thus:
<passage> ::= <128-bit UUID>
<passage> ::= "<Unicode string>"
<passage> ::= (<passage> <passage>*)
where <passage>* denotes zero or more further <passage> elements, and double quotation marks within a Unicode string are escaped as \".
In an example, a semantic node may be represented by a 128-bit UUID or a string;
a paragraph may be (a) a semantic node or (b) one or more other paragraphs in brackets.
In another example, the minimum number of paragraphs that can be grouped in brackets is two, so the third line of the above grammar would be <passage> ::= (<passage> <passage> <passage>*).
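Purely by way of illustration (and not as part of the specification), the simple grammar above can be modelled and parsed with a short program. The following sketch, in Java, uses class and method names that are assumptions of this illustration: a UL expression is either a semantic node (a UUID or a quoted string) or a bracketed paragraph of sub-expressions, parsed with a small recursive-descent parser.

import java.util.ArrayList;
import java.util.List;

// Minimal, illustrative model of UL expressions: a semantic node is a UUID or a quoted
// Unicode string; a paragraph is a bracketed list of further expressions.
// All names here are assumptions for illustration, not part of the specification.
public final class UlParser {

    public interface Expression {}                                  // a node or a paragraph

    public record UuidNode(String uuid) implements Expression {}    // e.g. b1c1cb5f-248f-4871-a73f-900d29066948
    public record StringNode(String text) implements Expression {}  // e.g. "sangria"
    public record Paragraph(List<Expression> parts) implements Expression {}

    private final String src;
    private int pos;

    private UlParser(String src) { this.src = src; }

    public static Expression parse(String text) {
        UlParser p = new UlParser(text);
        Expression e = p.parseExpression();
        p.skipWhitespace();
        if (p.pos != text.length()) throw new IllegalArgumentException("trailing input");
        return e;
    }

    private Expression parseExpression() {
        skipWhitespace();
        char c = src.charAt(pos);
        if (c == '(') return parseParagraph();
        if (c == '"') return parseString();
        return parseUuid();
    }

    private Paragraph parseParagraph() {
        pos++;                                        // consume '('
        List<Expression> parts = new ArrayList<>();
        skipWhitespace();
        while (src.charAt(pos) != ')') {
            parts.add(parseExpression());
            skipWhitespace();
        }
        pos++;                                        // consume ')'
        return new Paragraph(parts);
    }

    private StringNode parseString() {
        pos++;                                        // consume opening quote
        StringBuilder sb = new StringBuilder();
        while (src.charAt(pos) != '"') {
            if (src.charAt(pos) == '\\') pos++;       // escaped character, e.g. \"
            sb.append(src.charAt(pos++));
        }
        pos++;                                        // consume closing quote
        return new StringNode(sb.toString());
    }

    private UuidNode parseUuid() {
        int start = pos;
        while (pos < src.length()
                && (Character.isLetterOrDigit(src.charAt(pos)) || src.charAt(pos) == '-')) pos++;
        return new UuidNode(src.substring(start, pos));
    }

    private void skipWhitespace() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    public static void main(String[] args) {
        // Parses a paragraph combining a UUID node, a string node and a nested paragraph.
        Expression e = parse("(b1c1cb5f-248f-4871-a73f-900d29066948 \"sangria\" (a6ba9f28-b54d-4e4a-8cf8-ad4e07659004))");
        System.out.println(e);
    }
}

In this sketch any bare token is treated as a UUID node; a fuller implementation might validate the UUID format and, in the variant just described, enforce a minimum of two sub-expressions per bracketed paragraph.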
In the preferred example, a given semantic node is typically labelled with a specific thing or a specific meaning. Although ambiguity is allowed in the preferred example, with an essentially unlimited pool of available UUIDs there is no reason to overload a node with more than one meaning, and in practice all the possible meanings of a word in natural language will be given different semantic nodes. Closely related meanings of a word can also be given different semantic nodes, with their relationships described in paragraphs. This use of a unique semantic node for each possible meaning overcomes the complexity and ambiguity of determining the meaning of natural language.
The use of semantic nodes also avoids any ambiguity arising from concepts that are labelled with multiple words in natural language. In the preferred example, compound nouns, verb phrases, prepositional verbs and the like do not as such exist: each such concept has a single node, and the machine faces no challenge in deciding where the boundaries of the representation lie.
Node protocol
We use the term "user" herein to mean any human, organization, or machine user of the examples of the present invention. The user may be any computer system using examples of the invention or any human or organization using UL. It may also be a subsystem of a larger computer system.
In a preferred example, if two semantic node identifiers are different, they may or may not denote the same concept, because two different users of the invention may have selected two different IDs for the same thing. If two identifiers are identical, then by design, in the preferred example, they must denote the same thing, defined in the same way. Thus, as agreement on semantic nodes is reached and adopted across different systems, UL becomes useful for communication. When there is enough knowledge about semantic nodes to be able to express them in natural language, and to generate natural language with meaning similar to the UL, UL can also become understandable to human users.
In a preferred example, the meaning of a semantic node comes only from other UL representing what has been said about the node. Sometimes this may be knowledge expressed in UL stating that the semantic node corresponds exactly to a word, or to one meaning of a word, in a specified natural language. For example, the Spanish drink Sangria may be represented as a6ba9f28-b54d-4e4a-8cf8-ad4e07659004. A pen in the sense of a writing implement may be represented as c092849c-80ed-4a69-9a4e-2704780f0cea, but the concept of a pen in the sense of an enclosure for farm animals would have a completely different node, such as ba9b43a3-540d-44ff-b6fe-62dcfb9ddac. Although these meanings may be documented somewhere for human users, it is paragraphs of UL that define these concepts and semantically link them to other concepts that give them meaning. For example, in the case of the Sangria concept, paragraphs can assert that it is a drink, that it is an alcoholic drink, and that it originates from Spain. Paragraphs may further define ingredients or other information relevant to the machine's understanding of it.
As used herein, a "shared ID" is an ID used by more than one user of the various examples of the present invention. Typically, one user has created and used the ID, and a second or further users have decided that this ID represents a concept they also want to use, and have then started to use it as well. A "private ID" or "local ID" is, similarly, an ID that is used by only one user and is not published or exposed to other users. A "public ID" is an ID that every user can see has been used in UL; whether it becomes shared depends on whether other client entities start to use it. According to various examples, an ID may be shared among multiple users without being fully public.
In other words, any user of the present example can create their own semantic nodes, with their own local meaning, by picking an unused identifier. For example, an application may assign a semantic ID to a row of a particular local database table. Any number of different IDs may represent the same thing. However, when semantic nodes are shared, their meaning is shared: if another user subsequently uses these IDs elsewhere, they will mean and label the same thing. In the preferred example, with a 128-bit address space, a random selection of an ID from that space has an essentially zero probability of collision unless it is deliberate, so local IDs can be created and used without going through any kind of registration process or communication or coordination with any other user. In another example, string identifiers may be used, and a user may include a unique substring in their own local identifiers, for example using it as a prefix. For example, an organization might choose a unique prefix such as "unlikelyai719" and then name all its nodes starting with it, such as "unlikelyai719_sangria"; with the unique prefix, it can ensure that its local IDs are not duplicated by any other user. Other examples may use a smaller address space and have a more centralized approach, which may include registration.
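By way of illustration only, minting a local ID in the way just described can be as simple as drawing a random version 4 UUID, or prefixing a string identifier with an organization-specific substring. The short Java sketch below assumes the example prefix "unlikelyai719" used above; the class and method names are assumptions of this illustration.

import java.util.UUID;

// Illustrative sketch only: two ways of minting a local semantic-node ID as described above.
public final class LocalIds {

    // A random version 4 UUID; with a 128-bit space the chance of colliding with an ID chosen
    // independently by another user is essentially zero, so no registration is needed.
    public static String newUuidId() {
        return UUID.randomUUID().toString();   // e.g. "b1c1cb5f-248f-4871-a73f-900d29066948"
    }

    // A string identifier made unique with an organization-specific prefix (hypothetical here).
    public static String newPrefixedId(String prefix, String name) {
        return prefix + "_" + name;            // e.g. "unlikelyai719_sangria"
    }

    public static void main(String[] args) {
        System.out.println(newUuidId());
        System.out.println(newPrefixedId("unlikelyai719", "sangria"));
    }
}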
Character string
As previously mentioned, in the preferred example a Unicode string also represents a semantic node. Its meaning is, strictly speaking, just the string of characters itself: any natural-language meaning carried by the string is not part of the meaning of these IDs. That is, "Sangria" strictly means the letter sequence s,a,n,g,r,i,a, rather than the concept of a drink. Following the node protocol principles discussed above, it is also possible for the string to be represented by an ID as an additional identifier. For example, the string "Sangria" may additionally be represented as fc570fba-cb95-4214-bc45-8deb52d830a5, with this relationship expressed in paragraphs or elsewhere. This can be useful for very large strings. Following the same design principle, two identical strings used as semantic nodes have the meaning of the string in common.
Combined node
Combinations of semantic nodes also define semantic nodes. The sharing rules surrounding the meaning of a shared node, or of the shared class to which a node belongs, define the further shared meaning brought by the combination. For example, semantic nodes within an infinite class may be represented by combining one or more nodes defining the class with one or more strings, where an internationally recognized string representation exists. For example, integers may be defined in this way: (<id of integer> "5"). Another example of a combined node with more than two nodes in it is (<id of group> <id of Chicago> <id of New York City> <id of London>), which is a single semantic node representing a set of three cities considered as a single entity. Combined nodes may contain any finite number of semantic nodes, and the semantic nodes within these combined nodes may themselves be combined nodes, creating any level of nesting.
Nesting arrangement
UL syntax allows expressions to be nested indefinitely. This allows a user to define a concept, together with contextual information about the concept, as a hierarchy of UL expressions under the same parent UL expression or around the same central expression or paragraph. The context may be used to provide nuance, sources, beliefs, temporal validity and so on. For example, starting with the paragraph (HoldsOffice JoeBiden UsPresident), where HoldsOffice, JoeBiden and UsPresident are human-readable "nicknames" for IDs, described further below, another paragraph stating when this is true might be (HasTemporalValidity (HoldsOffice JoeBiden UsPresident) (DateIsoFormat "2021-07-23")), saying that Joe Biden held the office of US President on 23 July 2021, without saying anything about any other US president. Further expressions around this paragraph may assert that the statement comes from a particular source, or has a certain degree of reliability, and so on.
Combined protocol
In a similar way to the node protocol principle, where use of the same semantic node by the same or different entities implies the same meaning across uses, the meaning brought by combining shared semantic nodes is also universal. Any client entity that chooses to create a paragraph using shared semantic nodes is also expressing the same meaning by combining them. Similarly, any client entity is free to define its own meaning for a combination of semantic nodes that is not used elsewhere.
In other words, further meaning is brought about by combining semantic nodes, and if the semantic nodes are shared, the meaning brought by combining them is also shared. In a preferred example, semantic nodes can be combined in any number and at any level of nesting, and no further syntax is required. Other examples may include additional syntax. Combination and nesting in this and the preferred examples is denoted with brackets when UL is displayed. However, various examples and implementations may represent combinations of nodes in other ways. For example, the syntax may group nodes with characters other than brackets; there may be special function words in the syntax, such as "query", which are not semantic nodes but define different types of paragraph; or there may be special syntax for particular types of node, such as unknowns, booleans or integers.
Combination in UL in the preferred embodiment is expressed directly with brackets. Where brackets are absent, there is no ambiguity about how the nodes group and no assumed grouping. This syntax thus avoids an extremely common ambiguity in natural language, where clauses and words group with and modify other parts of the sentence in ways that are extremely hard for a machine to determine. For example, in the English sentence "The police arrested the demonstrators because they feared violence", it is unclear whether "they feared violence" applies to the demonstrators or to the police. Natural language has no strict grouping syntax, which means this type of ambiguity is common. A human with rich experience of the world can infer from common sense and their knowledge of the world that, in this case, the police are the more likely referent. The unambiguous grouping in UL ensures that the machine can always get this right.
UL syntax
A UL expression is a semantic node or a paragraph. Both variants are atomic, complete, valid fragments of UL. Semantic nodes are fully defined above and comprise either UUIDs or Unicode strings. A paragraph is any combination of nodes, and is the sole nesting structure of UL. A paragraph can be anything from a single semantic node to a complex hierarchy representing an entire book.
Thus, in a preferred example, a simple, more formal grammar for UL may be:
<expression> ::= <semantic_node> | <passage>
<semantic_node> ::= <uuid> | <string_literal>
<passage> ::= ( <expression> <expression>* )
The above grammar uses the symbols ( and ) to mark the groupings; implementation-specific details, such as character escaping within the string_literal node, are omitted.
Various examples of the invention may extend or alter the above grammar to meet their needs. Possible extensions include additional syntax for particular types of node: for example integers, real numbers, points in time, or unknowns. Other possible extensions include comments and "designated paragraphs" that are ignored by any UL parser. A designated paragraph may represent, for example, the manner in which a paragraph is linked to a nickname in a human-readable language such as English. Nicknames are described further in the following sections.
Note that the preferred example uses this extremely simple syntax without extensions, and has no special additional syntax for different types of paragraph: meaning comes purely from the choice of semantic nodes and from how they are grouped and ordered. Being able to express anything with such a simple representation is a significant advantage compared with alternative approaches that use a more complex syntax, or special additional syntax for particular things. These advantages include simplicity and generality of implementation, which can result in significant speed improvements when processing the language. It also greatly simplifies the storage and retrieval of UL and the ability to process it. With a more complex syntax, different code is needed to handle each different type of syntax, and storage becomes more complex. More complex storage in turn increases the complexity of the code that needs to access it and reduces the speed at which it can be accessed.
Nickname
To make UL more understandable to humans, various examples have a "nickname" scheme, in which shorter natural-language labels are associated with UL expressions and these names are used when presenting UL to humans. These labels can be used as substitutes for the original UUIDs when displaying UL to a person. In a preferred example, any number of nickname schemes may be defined, so that users familiar with different natural languages can access the meaning of the UL. An example nickname scheme used herein is referred to as English1. In some examples (but not the preferred example), these nicknames may be used as identifiers. In the preferred example they are not identifiers, but merely the manner in which identifiers are presented to a human.
As an example of UL representation, this is a valid paragraph:
(03d206a2-52ca-49e1-9aeb-86364e2dead6
cb75d6f8-16d9-4a36-8c16-7195182d4057(d1fd5662-c88e-4d94-b807-5310483df8cd(30847a3d-e43c-4229-993e-20ad01adc126
5a533842-bcd8-4125-8b39-2b1caa643593)))
The meaning of this paragraph corresponds to the English "Brie is a creamy, French cheese".
These nicknames are assigned to semantic nodes within the paragraph as follows:
IsA = 03d206a2-52ca-49e1-9aeb-86364e2dead6
Brie = cb75d6f8-16d9-4a36-8c16-7195182d4057
Cheese = d1fd5662-c88e-4d94-b807-5310483df8cd
Creamy = 30847a3d-e43c-4229-993e-20ad01adc126
French = 5a533842-bcd8-4125-8b39-2b1caa643593
This means that the above paragraph may be displayed in a more human-readable form, thus:
(IsA Brie (Cheese (Creamy French)))
Creamy and French are semantic nodes that are characteristics or attributes of other semantic nodes. This concept (given the nickname Attribute) is a semantic node corresponding to the class of such nodes, and is shared. Part of the shared meaning in the preferred example is that combining two or more attributes forms a new attribute to which all the constituent attributes apply, so (Creamy French) is itself an attribute meaning, in English, "creamy and French".
Cheese is a class: the class of things that are cheeses. Another shared meaning coming from classes and attributes is that combining a class with an attribute gives the class of everything in the original class that has the attribute, so (Cheese (Creamy French)) is the class of all creamy, French cheeses; in English, "creamy, French cheese".
IsA is a semantic node whose shared meaning, when combined with a node and a class, is that the node is a member of that class, so (IsA Brie (Cheese (Creamy French))) can be translated into the English "Brie is a cheese that is creamy and French", i.e. "Brie is a creamy, French cheese".
Also, in the preferred example, the English1 nicknames are chosen to be helpful, by selecting English names that correspond closely to the meaning of the node. However, the meaning of a node comes only from its use in the system.
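As an illustration only, a nickname scheme such as English1 might be applied when displaying UL by keeping a simple map from IDs to labels and substituting labels while rendering a paragraph, as in the Java sketch below; the representation and names here are assumptions of this illustration rather than part of the specification.

import java.util.List;
import java.util.Map;

// Illustrative sketch: rendering a UL paragraph for display by substituting nicknames for IDs.
public final class NicknameRenderer {

    // A paragraph is modelled here simply as a nested list: elements are either String IDs
    // or further List<Object> paragraphs (a minimal stand-in for a fuller UL model).
    private final Map<String, String> nicknames;

    public NicknameRenderer(Map<String, String> nicknames) {
        this.nicknames = nicknames;
    }

    public String render(Object expression) {
        if (expression instanceof List<?> parts) {
            StringBuilder sb = new StringBuilder("(");
            for (int i = 0; i < parts.size(); i++) {
                if (i > 0) sb.append(' ');
                sb.append(render(parts.get(i)));
            }
            return sb.append(')').toString();
        }
        String id = expression.toString();
        return nicknames.getOrDefault(id, id);   // fall back to the raw ID if no nickname exists
    }

    public static void main(String[] args) {
        Map<String, String> english1 = Map.of(
            "03d206a2-52ca-49e1-9aeb-86364e2dead6", "IsA",
            "cb75d6f8-16d9-4a36-8c16-7195182d4057", "Brie",
            "d1fd5662-c88e-4d94-b807-5310483df8cd", "Cheese",
            "30847a3d-e43c-4229-993e-20ad01adc126", "Creamy",
            "5a533842-bcd8-4125-8b39-2b1caa643593", "French");

        Object paragraph = List.of(
            "03d206a2-52ca-49e1-9aeb-86364e2dead6",
            "cb75d6f8-16d9-4a36-8c16-7195182d4057",
            List.of("d1fd5662-c88e-4d94-b807-5310483df8cd",
                    List.of("30847a3d-e43c-4229-993e-20ad01adc126",
                            "5a533842-bcd8-4125-8b39-2b1caa643593")));

        // Prints: (IsA Brie (Cheese (Creamy French)))
        System.out.println(new NicknameRenderer(english1).render(paragraph));
    }
}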
Negation
To say that something is not true, the combination of the semantic node Not and another relation node defines a relation node that is true whenever the original relation is false.
For example
((Not IsA)Bat Bird)
is a true statement.
Questions
A question may be represented in UL by combining a node that identifies the paragraph as a question with a list of zero or more unknowns and one or more further paragraphs that use the unknowns to represent the semantics of the question. In a preferred example, the UL paragraph has the form ((Question <unknowns>) (<passage>)), where Question is a semantic node, <unknowns> is a list of zero, one or more semantic nodes representing unknown values (similar in meaning to letters of the alphabet in algebra), and <passage> is where the unknowns are used to express what is being asked. In the preferred example these unknowns are simply semantic nodes, i.e. members of the class of unknowns: they have no special status other than their meaning. Here we use Unknown1, Unknown2, X, etc. as nicknames for members of this class. Note that in the preferred example a question is a UL paragraph like any other: the syntax of UL is not extended or changed in any way to support questions.
For example
((Question Unknown1)((HasAttribute Unknown1 Alcoholic)(IsA Unknown1 Drink)))
translates into the English "What drinks are alcoholic?"
A yes/no question has zero unknowns, so both
((Question)((IsA Sangria Drink)(HasAttribute Sangria Spanish)))
((Question)((IsA Sangria(Drink Spanish))))
translate into the English "Is Sangria a Spanish drink?"
Another example is:
((Question)(WithinRange(Integer"7")(AtLeast(Integer"5"))))
This is the question "Is 7 within the range 'at least 5'?"
An example question using an unknown (which is mapped to the output) is:
((Question Unknown1)(IsA Unknown1(Cheese Creamy)))
This is the question "What are the creamy cheeses?" or "List creamy cheeses." The unknown Unknown1 is used in the question, and any node that can be validly mapped to Unknown1 from what is represented in UL is returned as an appropriate output for the question.
An example of a UL question that would give "No" as the answer is:
((Question)(HasAttribute Cheddar Creamy))
This is the question "Is Cheddar creamy?" In an example where (a) Cheddar is known to be a hard cheese and (b) hard cheeses are not creamy, the system can prove that Cheddar is not creamy and will return a "No" result.
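As an illustration of how a question's unknowns can be mapped onto stored UL (the first of the answering methods described later), the Java sketch below walks the body of a question and a stored paragraph together, binding each unknown to whatever appears in the corresponding position; the representation and names are assumptions of this illustration.

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Illustrative sketch: matching a question body against a stored paragraph, binding unknowns.
// Paragraphs are modelled as nested lists whose leaves are node names (nicknames, for clarity).
public final class QuestionMatcher {

    private final Set<String> unknowns;

    public QuestionMatcher(Set<String> unknowns) {
        this.unknowns = unknowns;
    }

    // Returns the bindings of unknowns if the two structures match, otherwise empty.
    public Optional<Map<String, Object>> match(Object questionPart, Object stored) {
        Map<String, Object> bindings = new HashMap<>();
        return matchInto(questionPart, stored, bindings) ? Optional.of(bindings) : Optional.empty();
    }

    private boolean matchInto(Object questionPart, Object stored, Map<String, Object> bindings) {
        if (questionPart instanceof String node && unknowns.contains(node)) {
            Object existing = bindings.putIfAbsent(node, stored);   // bind, or require consistency
            return existing == null || existing.equals(stored);
        }
        if (questionPart instanceof List<?> qParts && stored instanceof List<?> sParts) {
            if (qParts.size() != sParts.size()) return false;
            for (int i = 0; i < qParts.size(); i++) {
                if (!matchInto(qParts.get(i), sParts.get(i), bindings)) return false;
            }
            return true;
        }
        return questionPart.equals(stored);                         // all other nodes must be identical
    }

    public static void main(String[] args) {
        QuestionMatcher matcher = new QuestionMatcher(Set.of("Unknown1"));

        // Body of the question ((Question Unknown1)(IsA Unknown1 (Cheese Creamy))).
        Object body = List.of("IsA", "Unknown1", List.of("Cheese", "Creamy"));
        // A stored paragraph believed to be true.
        Object stored = List.of("IsA", "Brie", List.of("Cheese", "Creamy"));

        // Prints: Optional[{Unknown1=Brie}]
        System.out.println(matcher.match(body, stored));
    }
}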
Reasoning
Reasoning is the generation of UL from other UL. A reasoning paragraph is a piece of UL indicating how new UL can be generated from other UL, for example logical consequences, or giving meaning to other nodes or combinations of nodes. For example, the English "if something originates from France then it is French" translates into a reasoning paragraph in UL.
A reasoning step is represented as a paragraph representing the semantics of the step. Note that in the preferred example reasoning paragraphs are represented in UL like anything else: there is no special syntax, and nothing that extends or alters UL to support reasoning. For example:
(ConsequenceOf(IsA X(Cheese Hard))((Not HasAttribute)X Creamy))
that is, "if X is hard cheese, then X is not cream. (If X is a hard cheese then X is not streamy.) "in the preferred example, these inference steps start with a ConsequeneOf semantic node. Then a section indicating that a jake condition is required for this step to be used ("if" section). This may be just one paragraph, as in this example, or it may be a paragraph containing other paragraphs, which would all require a gatekeeper. The third element is the real paragraph if the condition is met ("then" paragraph). This may also be one of a plurality of paragraphs, in which case all paragraphs contained would be true if the condition is met.
Further examples of reasoning steps:
(ConsequenceOf (IsIn X Y) (IsA Y GeographicalArea)), which in English is "if X is in Y then Y is a location" (IsIn represents geographical location), and
(ConsequenceOf (IsA X Continent) (IsIn X Earth)), which in English is "if X is a continent then X is in Earth".
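As an illustration of how a reasoning step such as (ConsequenceOf (IsA X (Cheese Hard)) ((Not HasAttribute) X Creamy)) might be applied, the Java sketch below unifies a goal paragraph with the "then" half of the step and, if that succeeds, returns the "if" half with the same substitutions applied, producing a new goal to prove; the representation and names are assumptions of this illustration.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Illustrative sketch: applying a single reasoning step backwards. The "then" half of a
// ConsequenceOf paragraph is unified with a goal; the bindings are then applied to the
// "if" half, yielding a new goal that, if proved, proves the original goal.
public final class ReasoningStep {

    private final Object condition;    // the "if" paragraph
    private final Object consequence;  // the "then" paragraph
    private final Set<String> variables;

    public ReasoningStep(Object condition, Object consequence, Set<String> variables) {
        this.condition = condition;
        this.consequence = consequence;
        this.variables = variables;
    }

    // If the goal matches the consequence, return the condition with the same bindings applied.
    public Optional<Object> subGoalFor(Object goal) {
        Map<String, Object> bindings = new HashMap<>();
        if (!matchInto(consequence, goal, bindings)) return Optional.empty();
        return Optional.of(substitute(condition, bindings));
    }

    private boolean matchInto(Object pattern, Object target, Map<String, Object> bindings) {
        if (pattern instanceof String node && variables.contains(node)) {
            Object existing = bindings.putIfAbsent(node, target);
            return existing == null || existing.equals(target);
        }
        if (pattern instanceof List<?> p && target instanceof List<?> t) {
            if (p.size() != t.size()) return false;
            for (int i = 0; i < p.size(); i++) {
                if (!matchInto(p.get(i), t.get(i), bindings)) return false;
            }
            return true;
        }
        return pattern.equals(target);
    }

    private Object substitute(Object expression, Map<String, Object> bindings) {
        if (expression instanceof String node && bindings.containsKey(node)) return bindings.get(node);
        if (expression instanceof List<?> parts) {
            List<Object> out = new ArrayList<>();
            for (Object part : parts) out.add(substitute(part, bindings));
            return out;
        }
        return expression;
    }

    public static void main(String[] args) {
        // (ConsequenceOf (IsA X (Cheese Hard)) ((Not HasAttribute) X Creamy))
        ReasoningStep step = new ReasoningStep(
            List.of("IsA", "X", List.of("Cheese", "Hard")),
            List.of(List.of("Not", "HasAttribute"), "X", "Creamy"),
            Set.of("X"));

        // Goal: prove ((Not HasAttribute) Cheddar Creamy).
        Object goal = List.of(List.of("Not", "HasAttribute"), "Cheddar", "Creamy");

        // Prints: Optional[[IsA, Cheddar, [Cheese, Hard]]], i.e. now prove (IsA Cheddar (Cheese Hard)).
        System.out.println(step.subGoalFor(goal));
    }
}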
Computation units
Computation units are the way that examples of the invention allow the computations needed for reasoning, and for other purposes, to be represented and used. Any automated process that computes something or returns a result can be supported with this technique. Various examples also allow computation units to support actions, such as completing a transaction, turning on a light or playing music.
As a simple example, according to various examples, a computation unit may allow UL questions such as "Is 7 greater than 5?" to be answered. Clearly, it would be impractical to have to explicitly enter a paragraph for every combination of two integers. In a preferred example, a computation unit can be considered a semantic node that is an instance of the ComputationUnit class. We can then add paragraphs representing the details of the unit needed to select and run it: what it can do, how it is run, and how its results should be interpreted.
For example, this is an example calculation unit for addition:
(ComputationLocalJavaLocation AdditionComputationUnit"ai.unlikely.questionprocessor.computation.Arithmetic$Addition")
(ComputationInputs AdditionComputationUnit InputOne InputTwo)
(ComputationDescription AdditionComputationUnit((Question Unknown1)((Equal(RealNumber Unknown1)(Add(RealNumber InputOne)(RealNumber InputTwo))))))
(ComputationDescription AdditionComputationUnit((Question Unknown1)((Equal(Add(RealNumber InputOne)(RealNumber InputTwo))(RealNumber Unknown1)))))
In this case, there are two paragraphs describing the computation unit. The description paragraphs have a head node, ComputationDescription, followed by the node for the unit they describe, and then a paragraph for the class of UL questions that they can help answer. We also have a paragraph describing the computation unit's inputs; in this case we say that two inputs are required to compute the addition function. The description paragraphs use these inputs to describe where they appear in the question. The final paragraph we need for the computation unit to work is a paragraph giving its location. In this case we use ComputationLocalJavaLocation at the head of the paragraph, meaning that we are describing a Java class that is available locally for the question processor to use at run time. From all this information, the system can recognize when a computation is needed, find the best way of obtaining an answer with it, and perform the computation. The preferred example supports various ways of invoking computation units. Each way can be described with similar paragraphs, but with a new head node, a different way of describing the location, and a different way of invoking the computation. For example, if we wanted to add a computation engine using a Lua script, we could add a paragraph such as:
(ComputationLuaScript AdditionComputationUnit"a,b=io.read('*n','*n')\nio.write(a+b)")
to make the engine aware of the computation unit in this way. Another example is to use an API endpoint and specify a URL that returns the result via a GET request.
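The Java class named by ComputationLocalJavaLocation is not reproduced in this document. Purely as an illustration, a locally invokable addition unit might look something like the sketch below, in which the interface, the class structure and the method signature are assumptions of this illustration rather than the actual classes named above.

// Illustrative sketch only: what a locally available computation unit for addition might look
// like. The specification above only names the class
// "ai.unlikely.questionprocessor.computation.Arithmetic$Addition" without reproducing it.
import java.util.List;

public final class Arithmetic {

    // A computation unit takes the bound input values (InputOne, InputTwo, ...) in order and
    // returns the possible values of the output unknown.
    public interface ComputationUnit {
        List<String> compute(List<String> inputs);
    }

    // Behaves in the way the ComputationDescription paragraphs above describe: given InputOne
    // and InputTwo as real numbers, returns the single value such that it equals their sum.
    public static final class Addition implements ComputationUnit {
        @Override
        public List<String> compute(List<String> inputs) {
            double a = Double.parseDouble(inputs.get(0));
            double b = Double.parseDouble(inputs.get(1));
            return List.of(String.valueOf(a + b));
        }
    }

    public static void main(String[] args) {
        // Answering ((Question Unknown1)((Equal (RealNumber Unknown1)
        //                                        (Add (RealNumber "2.5") (RealNumber "4.5")))))
        // would bind InputOne = "2.5", InputTwo = "4.5" and invoke the unit:
        System.out.println(new Addition().compute(List.of("2.5", "4.5")));   // prints [7.0]
    }
}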
Other examples of computing units are described herein to further illustrate this concept:
(IsA GreaterThanOrEqualComputationUnit ComputationUnit)
(ComputationLocalJavaLocation GreaterThanOrEqualComputationUnit"ai.unlikely.questionprocessor.computation.Comparison$GreaterThanOrEqual")
(ComputationInputs GreaterThanOrEqualComputationUnit InputOne InputTwo)
(ComputationDescription GreaterThanOrEqualComputationUnit((Question)((GreaterThanOrEqual(RealNumber InputOne)(RealNumber InputTwo)))))
for the greater-than-or-equal comparison; and:
(IsA EqualComputationUnit ComputationUnit)
(ComputationLocalJavaLocation EqualComputationUnit"ai.unlikely.questionprocessor.computation.Comparison$Equal")
(ComputationInputs EqualComputationUnit InputOne InputTwo)
(ComputationDescription EqualComputationUnit((Question)((Equal(RealNumber InputOne)(RealNumber InputTwo)))))
for equality comparison of real numbers.
Verifying whether UL has meaning/verifying paragraph
Just as non-meaningful words (such as nonsensical verses) can be written in natural language, the syntactically correct UL can also be written in a non-meaningful way. According to various examples, explicit rules identifying meaningful and non-meaningful UL syntax may be defined and automatically applied to determine invalid paragraphs. Note that invalidity is different from reality. Paragraphs may be valid, but still represent something that is not real.
For example, the IsA semantic node described above requires two additional nodes to form a paragraph, where the second node must represent a meaningful class. For example, (IsA Brie GatwickAirport has no meaning because gatwick airport is not a class (this paragraph would translate to english "brix is gartewack airport (Brie is a Gatwick Airport)") (IsA Brie Cheese OriginatesFrom) has no meaning because the IsA is followed by three nodes instead of two.
Verification may be performed with any machine-readable description of these constraints that can be read and checked automatically. In a preferred example, the constraints are naturally expressed as further UL paragraphs describing them (referred to herein as validation paragraphs). Typically, these constraints will be provided by the enterprise, organization, or individual that defined the key nodes in the paragraph. When validating paragraphs, examples of the invention may look at each paragraph and sub-paragraph and search for applicable validation paragraphs based on the semantic nodes in those paragraphs. By checking these validation rules against the paragraph being examined, a view of whether the paragraph is invalid can be determined.
An example of a validation paragraph on the node IsA is (Validation (IsA Unknown1 Unknown2) (IsA Unknown2 Class)), which states that for a paragraph of the form (IsA <node1> <node2>), the question ((Question) (IsA <node2> Class)) should return Yes (or at least not return No). If No is returned, the paragraph is invalid. To constrain the number of nodes that can follow a given node, the paragraph (ValidationCount IsA (Integer "2")) states that an IsA node must be followed by exactly two semantic nodes to be meaningful. Variations of these examples may define further constraints on meaningful paragraphs. In another example, (HasSchema ExpressedInEnglish (Schema ExpressedInEnglish Node String)) is an alternative way to express that an ExpressedInEnglish paragraph expects two additional nodes, the first in the class Node (any node) and the second in the class String.
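A minimal Python sketch of mechanically checking a ValidationCount constraint is given below; the nested-tuple encoding of a paragraph and the rule table are assumptions made for illustration, not the preferred example's storage format.

# Hypothetical sketch: checking a ValidationCount constraint against a paragraph
# represented as a nested tuple of node names (an assumed encoding of UL).

VALIDATION_COUNTS = {"IsA": 2}  # e.g. derived from (ValidationCount IsA (Integer "2"))

def check_validation_count(paragraph):
    """Return False if the head node has a declared count that the paragraph violates."""
    head, *rest = paragraph
    expected = VALIDATION_COUNTS.get(head)
    if expected is not None and len(rest) != expected:
        return False
    # Sub-paragraphs are checked recursively.
    return all(check_validation_count(p) for p in rest if isinstance(p, tuple))

print(check_validation_count(("IsA", "Brie", "Cheese")))                    # True
print(check_validation_count(("IsA", "Brie", "Cheese", "OriginatesFrom")))  # False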
Verification may be used to check or select valid UL generated during a statistical translation method that may not always produce valid UL (as described herein), and to check UL entered by a human. It may also be used to find contradictions or other logical inconsistencies in paragraphs. For example, a paragraph asserted using another relationship with semantics similar or related to IsA may conflict with a validation paragraph describing which semantic nodes can be combined with that relationship.
UL variants
Many variations on the definition of UL will be apparent to those skilled in the relevant art. Variants may include the choice of syntax, the choice of representation, and many other details of representation and implementation. As used herein and where appropriate, "UL" is intended to cover not only the preferred examples described herein but also all such similar UL-like representations and variants.
Method for answering questions
In a preferred example, the unknown items in the question are identified and the paragraphs that constitute the body of the question are selected for further analysis. The nodes following the Question node in the head of the question paragraph are the unknown items for which we try to find a mapping such that they satisfy the body of the question. Successfully satisfying the body with UL that is considered true is how the question is answered.
In an example, the process begins with the list of paragraphs from the body of the question and the selected unknown item. The first paragraph in the list is selected for processing. This processing aims to find all possible mappings for the unknown item that could be true for the selected paragraph.
In an example, processing a single paragraph involves three methods: using statically stored UL paragraphs, utilizing a computation unit, and utilizing UL generated from reasoning:
The first method is to look in the paragraph store for any paragraphs that can be mapped directly onto the paragraph being processed. If a stored paragraph has exactly the same structure as the paragraph being processed and all nodes except the unknown match, then the value the unknown is matched against is a valid result.
The second method is to check whether any results can be found by executing a computation unit. We check whether the paragraph matches any paragraph in a computation unit description. All non-unknown nodes in the paragraph being processed must match the same node in the corresponding location of the computation description, or align with a computation input unknown. The unknown item being processed must align with the output unknown in the description. The computation unit can then be invoked to obtain a valid output value for the unknown item of the paragraph being processed.
The third method is to see whether the paragraph can be proved by applying any inference steps. We look for inference paragraphs whose second half can be unified with the paragraph being processed. All nodes and structure must be equal between the two paragraphs, except for unknown items in either the focus paragraph or the inference paragraph. If such an inference paragraph is found, it means the inference step can prove the paragraph being processed. When matching an inference paragraph, a multi-stage process is used: first, any mapping of the unknown items in the paragraph being processed is found; second, a mapping of the unknown items used in the inference paragraph is found by mapping against the paragraph being processed. This mapping can then be applied to the first half of the inference paragraph to generate a list of paragraphs which, if matched against known or generated UL and mappings found for them, will prove the focus paragraph and yield a valid mapping for it. Solutions for that paragraph list can then be found recursively using the method being described. In some examples, we keep track of the inference depth (i.e., the number of inference paragraphs applied) currently being processed and, for latency reasons, impose a maximum depth limit on how far we explore in this way.
The three methods may occur in any order and are independent of each other.
After finding the list of valid mappings for the first paragraph in the list, we must consider the rest of the list. If the list contains only one paragraph, the returned mappings are valid. Otherwise, we take each solution returned and apply it to the rest of the list before processing it. This returns a set of mappings, which can then be combined with the mappings found for the first paragraph to give the final complete mapping.
Some questions are yes/no questions and have no unknown item for which we are trying to find a mapping. These questions are handled slightly differently. They are initially processed in the same way, to see whether we have paragraphs, inference steps or computation units that can prove that the question paragraph is true. If this returns a successful result, we can return a Yes result. If no successful result is returned, we take all paragraphs in the question and negate them using the Not node. Each of these negated paragraphs is then processed to see whether we can prove the negation of the original question. If one of these returns a successful result, a No result can be returned. If neither the initial processing nor the processing of the negated paragraphs returns a successful result, we cannot tell whether the question is true or false, and we can only return a DontKnow result.
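The following Python sketch illustrates the overall shape of the recursive mapping search, using only the first of the three methods (direct matching against stored paragraphs); the nested-tuple encoding, store contents and function names are assumptions made for illustration, and the computation-unit and inference methods would be further branches at the point marked in the code.

# Hypothetical sketch of the recursive mapping search over a paragraph list.

STORE = [("IsA", "Brie", "Cheese"), ("IsA", "Cheddar", "Cheese"),
         ("OriginatesFrom", "Brie", "France")]

def unify(pattern, fact, unknowns, mapping):
    """Try to match a paragraph pattern against a stored paragraph, binding unknowns."""
    if isinstance(pattern, tuple) and isinstance(fact, tuple) and len(pattern) == len(fact):
        for p, f in zip(pattern, fact):
            mapping = unify(p, f, unknowns, mapping)
            if mapping is None:
                return None
        return mapping
    if pattern in unknowns:
        if pattern in mapping and mapping[pattern] != fact:
            return None
        return {**mapping, pattern: fact}
    return mapping if pattern == fact else None

def apply_mapping(paragraph, mapping):
    if isinstance(paragraph, tuple):
        return tuple(apply_mapping(p, mapping) for p in paragraph)
    return mapping.get(paragraph, paragraph)

def solve(paragraphs, unknowns, mapping=None):
    """Yield every mapping of unknowns that satisfies all paragraphs in the list."""
    mapping = mapping or {}
    if not paragraphs:
        yield mapping
        return
    first, rest = paragraphs[0], paragraphs[1:]
    for fact in STORE:  # method 1: direct lookup; methods 2 and 3 would be added here
        m = unify(first, fact, unknowns, mapping)
        if m is not None:
            yield from solve([apply_mapping(p, m) for p in rest], unknowns, m)

print(list(solve([("IsA", "X", "Cheese"), ("OriginatesFrom", "X", "France")], {"X"})))
# -> [{'X': 'Brie'}]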
Question processing example:
To further explain the process, the following shows how the method achieves a result for a simple example. For this example, all the nicknames used are assumed to be valid, and the relevant stored UL paragraphs are those present in a trusted UL store, which is believed to contain only true factual statements:
(IsA X Unknown)
(IsA A Unknown)
(IsA B Unknown)
(IsA Cheddar Cheese)
(IsA Brie Cheese)
(OriginatesFrom Brie France)
(ConsequenceOf (OriginatesFrom X France) (HasAttribute X French))
(ConsequenceOf ((IsA X A) (HasAttribute X B)) (IsA X (A B)))
The question is ((Question X) (IsA X (Cheese French)))
This translates into English as "List French cheeses".
X is identified as the unknown item that needs to be mapped, and the paragraphs to be processed form a list (shown here in square brackets): [(IsA X (Cheese French))]. From now on we will show this as [(IsA X (Cheese French))] - X
(IsA X (Cheese French)) - X is processed. It cannot be matched directly with any paragraph, nor with any computation unit. However, it can be proved by the inference paragraph (ConsequenceOf ((IsA X A) (HasAttribute X B)) (IsA X (A B))), because it matches the second half. The mapping is applied and processing continues recursively.
[(IsA X Cheese), (HasAttribute X French)] - X is processed. The first paragraph is selected.
(IsA X Cheese) - X is processed. It can be matched against two paragraphs to give the mappings X -> Brie and X -> Cheddar. No computation unit or inference paragraph can be applied.
These mappings are then applied sequentially to the rest of the list. This results in the following paragraphs being processed:
(HasAttribute Brie French) - only the inference paragraph (ConsequenceOf (OriginatesFrom X France) (HasAttribute X French)) can be applied
(OriginatesFrom Brie France) - this matches exactly a trusted paragraph in the paragraph store, so we know it is true. Thus, (HasAttribute Brie French) is also true.
This is combined with the mapping from the level above to give X -> Brie as a valid result.
Then:
(HasAttribute Cheddar French) - only the inference paragraph (ConsequenceOf (OriginatesFrom X France) (HasAttribute X French)) can be applied
(OriginatesFrom Cheddar France) - this cannot be proved by any method. Therefore, (HasAttribute Cheddar French) cannot be proved.
Thus, this branch yields no result.
The complete process therefore gives a single valid mapping, X -> Brie, which in turn gives Brie as the final answer.
In various examples, the steps of answering a question are recorded in order to provide an explanation. For this question, the raw output from an example performing these steps is as follows:
Result: Yes
Solution: X -> Brie
Explanation:
(IsA Brie (Cheese French))
Known (IsA Brie Cheese)
(HasAttribute Brie French)
Known (OriginatesFrom Brie France)
The method outlined above for processing questions can also be used to solve crossword clues, a task for which conventional AI is poorly suited. For example, "Creamy French cheese" may be a crossword clue, and the above-described method enables the definition portion of a crossword clue or cryptic crossword clue to be resolved; this clue may generate the answer "Brie".
The approach outlined above is the general procedure for processing questions in some examples; further examples, however, benefit from improvements to the system that reduce latency.
One of these improvements is an in-memory "dynamic programming" cache that stores the resulting mappings of any paragraph with unknown items computed during the processing of the question. Because of the nature of question processing, exploring different inference branches can lead to the same paragraphs and unknown items being processed repeatedly. This cache means that each of these sub-problems need only be processed once, with subsequent attempts returning the mappings stored in the cache.
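A minimal Python sketch of such a cache is shown below; the cache key, the solver callback and the toy data are assumptions made for illustration.

# Hypothetical sketch of the in-memory "dynamic programming" cache used during a single
# question: results for a (paragraph, unknown) pair are computed once and reused when
# other inference branches raise the same sub-problem.

class SubQueryCache:
    def __init__(self, solver):
        self._solver = solver          # function (paragraph, unknown) -> list of values
        self._cache = {}

    def mappings_for(self, paragraph, unknown):
        key = (paragraph, unknown)
        if key not in self._cache:
            self._cache[key] = self._solver(paragraph, unknown)
        return self._cache[key]

# Usage with a toy solver standing in for the three processing methods:
cache = SubQueryCache(lambda paragraph, unknown: ["Brie"] if paragraph == ("IsA", unknown, "Cheese") else [])
print(cache.mappings_for(("IsA", "X", "Cheese"), "X"))  # computed -> ['Brie']
print(cache.mappings_for(("IsA", "X", "Cheese"), "X"))  # returned from the cache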
A purely recursive approach means that all data fetched from our database system must be fetched sequentially, just before the data is needed, with all further processing having to wait. To reduce this bottleneck, the system can be modified in two ways. These modifications allow data fetching and processing to occur asynchronously and in parallel as far as possible, before a final processing step explores the data and builds the results.
When looking at a paragraph with unknown items, the three stages outlined above (matching paragraphs in the storage area, fetching and executing computation units, and fetching inference paragraphs) can be processed in parallel, with data fetching done asynchronously so that the processing thread is not blocked. The inference-paragraph step then returns further paragraphs with unknown items that need to be processed, whose results can be used to give the result of the initial paragraph. This tree of sub-problems can be stored, and the processing of the sub-problems arising from inference can occur in parallel, allowing data fetching and the exploration of inferences to be parallelized.
Once all paragraphs have been processed to a given maximum inference depth, a second, non-parallelized step can be used to traverse the tree of processed paragraphs and unknown mappings to find valid answers. When viewing a paragraph list, where each paragraph now has its valid mappings from the paragraph store and from computation, a valid mapping for the list is one in which all unknown items have values and there are no contradictory mappings between paragraphs in the list. This step can recursively walk the data and find all valid mappings for the initial question, which can be returned as answers.
Various examples may selectively store at least some of the paragraphs that have been generated from reasoning or computation so that they are available for faster processing in the future. In various examples the history of these generated paragraphs is also stored, so that changes in the trust placed in the paragraphs used to generate them can be propagated to the newly generated paragraphs.
Priority queue example
An alternative to the recursive system outlined above is to use a priority queue to control the order in which sub-queries are processed. This alternative uses the same three steps to process a given query paragraph, but differs in the way paragraphs are selected for processing and the way sub-queries are stored. All query paragraphs revealed during processing are stored in a map along with any solutions found for them. This data can then be looked up by the shape of the query paragraph. The shape of a query paragraph is defined such that all unknown items are considered equal, so that the queries (IsA X Cheese) and (IsA Y Cheese) are considered to have the same query shape and are the same sub-query.
In addition to this map, we maintain an ordered priority queue of sub-queries to be processed. A query paragraph we wish to process is first run through the prioritization method outlined below to calculate its priority value. It is then placed on the priority queue such that the sub-query with the highest priority is at the front of the queue. The only query paragraphs initially added to the map and the priority queue are the paragraphs in the body of the incoming question. Processing begins with the highest-priority query being taken from the queue and processed using the three steps outlined above.
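The following Python sketch illustrates the query-shape normalization and the priority queue; the data structures and priority values are assumptions made for this illustration only.

# Hypothetical sketch of the priority-queue variant: query paragraphs are keyed by their
# "shape" (all unknowns treated as equal) and processed highest priority first.
# heapq is a min-heap, so priorities are negated to pop the highest priority first.

import heapq
import itertools

def query_shape(paragraph, unknowns):
    """(IsA X Cheese) and (IsA Y Cheese) normalize to the same shape."""
    if isinstance(paragraph, tuple):
        return tuple(query_shape(p, unknowns) for p in paragraph)
    return "?" if paragraph in unknowns else paragraph

counter = itertools.count()          # tie-breaker so heapq never compares shapes
queue = []                           # entries: (-priority, tie_breaker, shape)
query_map = {}                       # shape -> solutions found so far

def enqueue(paragraph, unknowns, priority):
    shape = query_shape(paragraph, unknowns)
    if shape not in query_map:       # each sub-query shape is added only once
        query_map[shape] = []
        heapq.heappush(queue, (-priority, next(counter), shape))

enqueue(("IsA", "X", "Cheese"), {"X"}, priority=10)
enqueue(("IsA", "Y", "Cheese"), {"Y"}, priority=5)   # same shape: ignored
while queue:
    _, _, shape = heapq.heappop(queue)
    print("processing", shape)       # the three processing steps would run here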
The third step of processing a sub-query outputs new inference steps, based on the inference paragraphs in the data, which are used to find solutions to the query. For example, the sub-query (IsA X (Cheese Creamy)) and the inference paragraph (ConsequenceOf ((IsA X Y) (HasAttribute X Z)) (IsA X (Y Z))) may result in the inference step:
If paragraphs: (IsA X Cheese) (HasAttribute X Creamy)
Then paragraph: (IsA X (Cheese Creamy))
These inference steps are stored so that the inference tree of the question can be explored in combination with the query map. The if paragraphs generated from this new inference step can then be added to the query map (if not already present), have their priorities determined, and be added to the priority queue for processing.
In this example, when new solutions to sub-queries are found during processing, they are added to the data in the query map. When this happens, we also look at the stored inference steps to see whether any solutions can propagate towards the root of the inference tree. For example, if we already know that X has the solution {Brie} for the query (IsA X Cheese), and we find the solutions X = {Brie, MashedPotato} when processing the query (HasAttribute X Creamy), we can revisit the inference step above. If we can find a value of X that satisfies both if paragraphs, we know it is a solution of the then paragraph. In this example, the solution X = Brie satisfies both if paragraphs, so it can be added to the query map for the then paragraph (IsA X (Cheese Creamy)).
Optimization
Limits may be imposed on the amount of work done to process a question, in order to control latency, by limiting the number of sub-queries processed. This may be done in conjunction with, or instead of, the depth limit.
This example allows for flexible parallelization of query processing. Instead of processing one query at a time from a queue, the system may process multiple queries simultaneously using multiple threads. Each thread may independently remove the next query to be processed from the queue, process it, determine the priorities of all resulting sub-queries and insert them into the queue. The thread may then obtain the next query to process from the queue.
Prioritization methods:
Various methods of query prioritization are possible; the simplest is to determine the priority of a query based on its depth in the search tree. The depth value of a query paragraph increases by one for each inference step it is away from the initial question. Using this prioritization allows the system to follow a breadth-first search pattern, processing all queries at a given depth before looking at the next level of the tree.
Alternative examples may consider many factors, including depth, the inference paragraph used to create the inference step, the position of the paragraph within the inference step, and any solutions already found for the parent paragraph or for sibling if paragraphs within the inference step. This can allow a "best-first" exploration of the search space, with the aim of exploring the areas most likely to provide solutions as soon as possible. This is beneficial because it can lead to faster processing of yes/no questions and improved behaviour when constrained by query-processing limits.
Using this prioritization scheme, the priority of a query can change because of solutions found elsewhere in the inference tree. Therefore, when a new solution is found and added to the query map, we trigger re-prioritization of all children of the query in question by looking at the inference steps revealed by its "then paragraph" (the paragraph describing the consequence of the inference step).
Complex reasoning steps:
Some queries may produce sub-queries that contain more than one unknown item, such as (IsA X Y). These queries may return many solutions, depending on the data, and can result in slow processing times. Reasoning steps that include these types of queries are referred to as complex reasoning steps. To overcome this problem we use an optimization for complex reasoning steps, in which initially only the if paragraphs containing one unknown item are processed. Any solutions found for that unknown item can then be substituted into the complex reasoning step to create simple reasoning steps with one unknown item each, which can be handled normally.
For example, take the query (IsA X Food) and the inference paragraph (ConsequenceOf ((IsA X Y) (IsSubclassOf Y Z)) (IsA X Z)).
This results in a complex reasoning step:
If paragraphs: (IsA X Y) (IsSubclassOf Y Food)
Then paragraph: (IsA X Food)
The first if paragraph contains more than one unknown item and is therefore not added to the priority queue for processing, whereas the second if paragraph can be processed. When processing the second paragraph, we can find the solutions {Cheese, Nut} for Y. Each solution is substituted into the reasoning step to create a new simple reasoning step. Here we use the result Cheese as an example, but this is done for all solutions of Y.
If paragraph: (IsA X Cheese)
Then paragraph: (IsA X Food)
We can now treat this new reasoning step as normal, prioritizing and processing the query paragraph (IsA X Cheese), as illustrated by the sketch below.
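A small Python sketch of this simplification follows; the dictionary representation of an inference step and the solution set for Y are assumptions used only to illustrate the substitution.

# Hypothetical sketch of simplifying a complex inference step: solutions found for the
# if paragraph with a single unknown (here Y in (IsSubclassOf Y Food)) are substituted
# into the step so that the resulting steps contain only single-unknown if paragraphs.

def substitute(paragraph, binding):
    if isinstance(paragraph, tuple):
        return tuple(substitute(p, binding) for p in paragraph)
    return binding.get(paragraph, paragraph)

complex_step = {
    "if": [("IsA", "X", "Y"), ("IsSubclassOf", "Y", "Food")],
    "then": ("IsA", "X", "Food"),
}

y_solutions = ["Cheese", "Nut"]      # found by processing the single-unknown if paragraph

simple_steps = []
for y in y_solutions:
    binding = {"Y": y}
    simple_steps.append({
        "if": [substitute(p, binding) for p in complex_step["if"]],
        "then": substitute(complex_step["then"], binding),
    })

for step in simple_steps:
    print(step)
# e.g. {'if': [('IsA', 'X', 'Cheese'), ('IsSubclassOf', 'Cheese', 'Food')], 'then': ('IsA', 'X', 'Food')}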
Thought results
Question answering in various examples has been described. Some questions may potentially require a significant amount of inference effort to answer, and various examples may choose to limit the amount of reasoning done for a particular application in order to return results in a reasonable time. This may be done, for example, by limiting the number of sub-queries that are executed for a particular problem.
When a question has been answered, the same question may be asked again in the future, and various examples may choose to store the results of the query so that the same question can be answered more quickly next time. Various examples may also extend this approach to sub-queries - saving results for the questions that the question processor asks itself during reasoning.
In examples, results may be saved while responding to a user question, or only during an offline process in which deep processing of questions can be performed without any user waiting and the results stored for online querying. These questions may be questions previously seen during online processing, or questions that log analysis indicates are frequently asked. Such an offline-only approach is a preferred example approach.
Questions are answered by executing a series of sub-queries generated by reasoning, which allow us to eventually find paragraphs that answer the questions. Some sub-queries occur frequently when dealing with different problems. Remembering these sub-queries and processing them more deeply offline typically allows us to answer questions faster and return better results (because we make more deep inferences about the sub-queries, giving us the opportunity to find more solutions). The output of this process is referred to as the thought result.
In an example, we store two types of information during the thought-results process: the thought results themselves (i.e., solutions to questions), and metadata about the thought results, including a record of everything we have thought about for a question and how frequently the results are used.
Thought result storage
As previously described, three methods are used for question processing: directly looking up paragraphs that can be unified with the current sub-query; using a computation unit; and reasoning. We store thought results by storing the intermediate paragraphs derived during the reasoning process, so that in the future they can be found by direct lookup.
For example, if we ask the question ((Question X) (IsA X Aerodrome)), we can answer X = GatwickAirport based on the following reasoning:
(IsA GatwickAirport Aerodrome)
reasoning paragraph (ConsequenceOf ((IsA A B) (IsSubclassOf B C)) (IsA A C))
(IsA GatwickAirport Airport)
Reasoning paragraph (ConsequenceOf ((IsA A B) (IsSubclassOf B C)) (IsA A C))
Known (IsA GatwickAirport InternationalAirport)
Known (IsSubclassOf InternationalAirport Airport)
Known (IsSubclassOf Airport Aerodrome)
We will store (IsA GatwickAirport Airport) and (IsA GatwickAirport Aerodrome) (both the final answer and the intermediate paragraph) as if they were normal paragraphs entered by a user or otherwise learned. If we are asked the same question later, the answer can come from a direct lookup of (IsA X Aerodrome) without any reasoning. The same applies to sub-queries: if some other question gives rise to the sub-query (IsA X Aerodrome) or (IsA X Airport), we can use the thought results directly instead of reasoning further.
Although thought results are just like any other paragraphs, they may be stored in their own paragraph store, separate from other paragraphs, so that they can be easily identified and managed, including being expired appropriately.
Along with each new paragraph stored as a thought result, we can also store an explanation. This is useful for preserving the steps used to derive the result. Without it, the explanation of X = GatwickAirport for the above question would only be: Known (IsA GatwickAirport Aerodrome). Finally, we can also store dependent paragraphs: these are the paragraphs (inference paragraphs or otherwise) that were used to arrive at the answer. If any of these paragraphs is updated or deleted, we delete the thought result, because it may no longer be valid.
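The Python sketch below illustrates one possible shape for such a record and its invalidation on dependency change; the class names and storage layout are assumptions for illustration only.

# Hypothetical sketch of a thought-result record: the derived paragraph is stored with
# its explanation and the paragraphs it depends on, so that it can be invalidated if any
# dependency is later updated or deleted.

from dataclasses import dataclass, field

@dataclass
class ThoughtResult:
    paragraph: tuple                    # e.g. ("IsA", "GatwickAirport", "Aerodrome")
    explanation: list                   # derivation steps, kept so explanations survive
    dependencies: set = field(default_factory=set)

class ThoughtResultStore:
    def __init__(self):
        self._by_paragraph = {}
        self._by_dependency = {}

    def add(self, result: ThoughtResult):
        self._by_paragraph[result.paragraph] = result
        for dep in result.dependencies:
            self._by_dependency.setdefault(dep, set()).add(result.paragraph)

    def invalidate(self, changed_paragraph):
        """Remove every thought result that relied on a paragraph that changed."""
        for dependent in self._by_dependency.pop(changed_paragraph, set()):
            self._by_paragraph.pop(dependent, None)

store = ThoughtResultStore()
store.add(ThoughtResult(
    paragraph=("IsA", "GatwickAirport", "Aerodrome"),
    explanation=["Known (IsA GatwickAirport Airport)", "Known (IsSubclassOf Airport Aerodrome)"],
    dependencies={("IsA", "GatwickAirport", "Airport"), ("IsSubclassOf", "Airport", "Aerodrome")},
))
store.invalidate(("IsSubclassOf", "Airport", "Aerodrome"))  # the derived result is removed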
In an example, we also take care to store paragraphs only for as long as they can reasonably be expected to remain valid. Some paragraphs are only temporarily true, such as the price of an actively traded commodity or the local time at a particular location; some have a much longer validity half-life, such as the holders of political office; and some remain true indefinitely.
Question answering with thought results
To incorporate the thought results described above into our question-answering process, we query the thought-results paragraph store when looking for paragraphs that can be directly unified with the current sub-query, as already described. However, nothing described so far prevents the system from continuing to reason about sub-queries, potentially re-deriving exactly the same results as those provided by the thought results. We need to know that we have previously thought about a sub-query so that we can avoid reasoning about it again. That is why it is preferable to store metadata about the thought results as well.
Metadata
We not only store paragraphs such as (IsA GatwickAirport Aerodrome), we also record that we have thought about (IsA X Aerodrome). Before reasoning about any sub-query, we check the set of sub-queries we have already thought about. If we find that we have thought about it before, we disable reasoning for that sub-query and only attempt direct unification. To speed things up further, we can also record the number of solutions that were produced when the sub-query was processed. If this is zero, we can avoid even searching for paragraphs that unify with the current sub-query: we know there are none.
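A minimal Python sketch of this check follows; the metadata table, the query shapes and the returned plan labels are assumptions made only to illustrate the decision.

# Hypothetical sketch of the metadata check: before reasoning about a sub-query we
# consult a record of sub-queries already thought about offline, and skip reasoning
# (or even the direct lookup, when zero solutions were found) accordingly.

thought_metadata = {
    ("IsA", "?", "Aerodrome"): {"solutions": 3},
    ("IsA", "?", "UnicornBreed"): {"solutions": 0},
}

def plan_for(sub_query_shape):
    meta = thought_metadata.get(sub_query_shape)
    if meta is None:
        return "reason as normal"
    if meta["solutions"] == 0:
        return "skip entirely"            # nothing was found even with deep reasoning
    return "direct lookup only"           # thought results already cover this shape

print(plan_for(("IsA", "?", "Aerodrome")))      # direct lookup only
print(plan_for(("IsA", "?", "UnicornBreed")))   # skip entirely
print(plan_for(("IsA", "?", "Cheese")))         # reason as normal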
Offline process
In a preferred example, the offline process runs questions with a very high reasoning effort and stores the resulting thought results in the manner already described. Metadata is stored in a distributed in-memory cache; we store the fact that a question has been processed, when it was processed, and the number of solutions generated. Sub-queries generated during reasoning are also added to the distributed cache.
The offline process may run continuously, selecting questions to process based on their hit count (see below) and how long ago we last processed them. Questions with low hit counts that were processed some time ago are removed from the cache, as are questions where there is evidence that the results may have expired.
Online question answering
Before answering a top-level question outside the offline process (i.e., a question that must be answered quickly), the preferred example consults the distributed cache to see whether the question has been processed into thought results recently enough. If so, we process the question without reasoning (i.e., we use the thought results). If the cache tells us that the question produced no thought results, we return immediately without querying the thought-result store. In these cases we benefit from being able to return high-quality results almost immediately.
If thought results are not available (either because we have not previously processed the question, or because we processed it too long ago), we proceed as usual (i.e., reasoning until we reach the reasoning-effort threshold or terminate for other reasons). We still consult the cache for sub-queries and do not reason about any sub-query found in the cache (and if the cache tells us there are no results, the database lookup is avoided entirely). When we get such cache hits for sub-queries we do not adjust our reasoning budget in any way; in this case the thought results do not improve latency, but they can significantly improve the quality of the results (where thought results exist for the sub-queries).
Whenever we consult the cache, we record a hit for the query. If the query is not already present in the cache, we add it with a hit count of 1.
For space reasons, various examples may choose to periodically remove thought results that are rarely or never used, even though these results are still considered valid.
According to various examples, including the preferred example, the metadata may include all of the paragraphs used to generate the results, including the inference paragraphs. In examples where paragraphs may later be found to be untrue or invalid, or may be changed, this metadata enables thought results that depend on an invalidated paragraph to be removed immediately. If a thought result uses another thought result to generate its result, the dependent paragraphs of the thought result used are included in the dependencies of the new thought result.
Automatic monitoring
Various examples may use an automated monitoring process to determine the value of the paragraphs stored in a paragraph store. The benefit of this technique is that the large amount of information represented in UL can be maintained scalably, with little or no human supervision. The inference engine in the question processor then uses the value of a paragraph determined by this process to decide whether it should use the paragraph. This allows paragraphs to originate from sources of low or unknown quality, in the knowledge that bad paragraphs will eventually be identified and no longer used. In other words, it enables the system to learn which of its stored paragraphs are useful, true or otherwise valuable, and which are not.
When a new paragraph is added to the storage area by a person, it is assigned a low initial trust value if added by an average user. Privileged users, or users the system has learned to trust, may result in higher starting values. The inference engine may then be instructed to be more experimental when processing a question, meaning that it may attempt to answer the question using paragraphs with lower values. The answers provided by the experimental inference engine are then monitored for any signals indicating whether the low-value paragraphs had a positive or negative impact on the answers. This information is then fed back into the automated monitoring process, which re-evaluates the values of the paragraphs using the new signals.
Examples of the signals used include the results of test questions with known good answers: paragraphs that, when used, support or are compatible with these answers produce a positive signal for the paragraph, while those that lead to erroneous results, or greatly slow down the production of good results, produce a negative signal. Signals may also come from real-world outcomes. Positive feedback from a user (i.e., an indication that the system has generated value) sends a positive signal to all paragraphs of all types used to generate the result. Likewise, poor feedback taints all the paragraphs used. Some good paragraphs may be unfairly tainted, but over time they will also receive positive signals, and it becomes possible to identify the paragraphs that are consistently the sources of negative results.
Value vector
According to various examples, the overall value of a paragraph is a combination of factors that may vary depending on the system or process in which it is to be used and on the context. For this reason, a value vector may be assigned to each paragraph, where each number represents a different quality of the paragraph. This allows us to have separate dimensions for, say, truth, usefulness and efficiency. A process using the paragraph then only needs a priority vector whose number at each index indicates the priority that process places on that quality, and the overall value of the paragraph for that process can be obtained from the dot product of the two vectors. It is sometimes also useful to use the values individually in certain contexts, in which case knowledge of how applicable each score is to that context can be used to optimize our use of paragraphs. For example, allocating an inference budget in the question processor may be based primarily on the efficiency score.
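The following worked Python sketch illustrates the dot product; the dimension names follow the text, while the numbers themselves are purely illustrative assumptions.

# Hypothetical sketch of combining a paragraph's value vector with a process's priority
# vector; the dimensions (truth, usefulness, efficiency) follow the text above.

paragraph_value = {"truth": 0.9, "usefulness": 0.4, "efficiency": 0.7}
process_priority = {"truth": 1.0, "usefulness": 0.2, "efficiency": 0.5}  # e.g. a question processor

overall_value = sum(paragraph_value[k] * process_priority[k] for k in paragraph_value)
print(round(overall_value, 2))  # 0.9*1.0 + 0.4*0.2 + 0.7*0.5 = 1.33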
Offline processing and experimentation
A further approach to automatic monitoring is to run a persistent process that reprocesses the questions seen in production at a higher level of experimentation, to test whether any low-value paragraphs might help find more answers. Any low-value paragraph that helps to provide further answers can be boosted with a positive signal. According to other examples, the offline process may run test questions with known answers using the paragraph being tested, and see whether it results in erroneous or irrelevant answers being found, or otherwise has an unacceptable impact on the performance of the system (such as poor latency). This can be used both to validate paragraphs that contain information and for paragraphs such as inference paragraphs. For production use, paragraphs determined by the test process to be detrimental may be ignored.
Learning
Examples of the invention, including examples implementing any application described herein or other applications, may store what they learn in UL (or the like) and then utilize the stored UL to improve their performance. The learned paragraphs are stored in long-term or short-term memory and used to deliver the application.
This approach is in contrast to what is commonly referred to in the art as machine learning, where parameters or weights are learned that allow the model to perform statistically better at classification, regression, or other tasks. Examples may also combine language-based learning with statistical machine learning as described herein.
What is learned as described herein is not weights: it is expressed in language and translatable into natural language concepts and ideas, enabling examples of the present invention to reason with what they have learned and to explain what they have learned to human users. This learning also enables conversations with users in text or spoken language in ways that the weights in a statistical model do not.
Sources/methods for learning in examples of the invention include:
(a) Learning from a dialogue or other natural language provided by a user: by translating natural language provided by a user in spoken or written form into UL and storing it, concepts, ideas and knowledge represented in the stored UL are learned and can be utilized.
(b) Learning from reasoning: UL generated from chains of inference may be stored and utilized. Reasoning can be directed towards specific goals, such as answering questions, or undirected, as with thought results designed to find potentially useful ideas.
(c) Learning from other natural language sources: by translating all or part of a document (such as a web page, scientific paper, or other article) into UL, the resulting UL can be utilized by the applications described herein. Other sources of natural language may include audio recordings or videos containing human speech, where speech recognition techniques are first used to create a text transcription of the recorded speech, which is then translated into UL. In some examples, a neural network may be trained end-to-end to convert the audio data directly to UL. For video, examples may incorporate knowledge of the content shown in the video, such as content described by a machine learning model designed to analyze the video and its synchronized audio, in order to better interpret the audio or to augment it with additional information recorded in the learned UL.
(d) Learning from structured data: structured data such as the contents of tables found in documents or on web pages, spreadsheets, or relational, graph, or other databases. Structured data also includes formats, such as JSON, that may be the output of automated systems. Structured data can be converted to UL by assigning semantic nodes to identifiers in the structured data, or to relationships corresponding to the relations in a relational database, and generating UL corresponding to the meaning of the structured data.
(e) Learning from analysis of other data. Examples of the invention may analyze data, process the data with an algorithm, and express the results of the analysis in the UL. By storing the resulting UL, the analyzed and derived data is made available to the system in a form that can be processed and inferred as described herein. In some examples, the analysis may be performed using a machine learning model.
Distributed use. Semantic node resolution.
As previously mentioned, the preferred example enables any user of the UL to use any new ID-essentially a private ID-for any node. However, if the entity is in use elsewhere, it may be meaningful for the user to use the shared ID for the node.
To achieve this, the service of the preferred example provides the node with a shared ID according to the description of the node. This is referred to herein as Semantic Node Resolution (SNR).
To implement this service, it requires information in UL about existing semantic nodes that the service can return. This information will typically be public to the shared nodes, but may also be based on additional private information about the nodes. When invoking the SNR service, the caller provides a description giving information about the entity for which the semantic node is requested. In various examples, the description of this caller may be UL or it may be a natural language description-or a combination of both.
The SNR service then compares the known information about the entity being described with the description it has about the existing node to see if it can confidently match the new node with the known node and thus provide the shared ID.
To this end, the SNR service considers potential matches and attempts to estimate the probability that the described entity and a candidate node are the same. Above a certain threshold probability, e.g., 0.999, the shared node is provided. In various examples, the possible matches and their probabilities may be returned, enabling the caller to decide for itself whether to use the shared ID or a new ID.
The probability calculation combines probabilities from the various parts of the description.
For example, assume the unknown node is a person with the first name "William", the surname "MacDonald", a date of birth of 1953-04-02, and Ireland as the country of birth. Resolution is not possible from the date of birth alone, because thousands of people share the same date of birth, but combining this with a shared country of birth and a shared name makes the probability that they are the same node very high, and the use of a shared ID becomes reasonable. An embodiment for people would include heuristics and data to estimate the probability that any person has a particular birth date or a particular name, combine these probabilities, and compare them against the total number of possible entities in the class. These calculations can be used to estimate the probability that the match is unique.
Note that some probabilities may be treated as independent and multiplied together, while others are not independent and need to be combined with care. For example, knowing that the person is a woman roughly halves the candidates, since the numbers of men and women are approximately equal. The name Jane significantly reduces the possibilities, because only a small percentage of people are called Jane, but knowing both that the node has the name Jane and that it is female adds little information beyond the name alone, because almost everyone named Jane identifies as female. There are also more subtle dependencies: for example, the probability of a name varies greatly from country to country.
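A minimal Python sketch of such an estimate, treating the attributes as independent for simplicity, is given below. All of the per-attribute probabilities and the population figure are illustrative assumptions, not real statistics, and the uniqueness estimate is deliberately crude.

# Hypothetical sketch of the semantic node resolution estimate: independent attribute
# probabilities are multiplied to estimate how many candidates in the class would match
# by chance, giving a rough probability that the described entity is the known node.

population = 8_000_000_000                 # candidate entities in the class "person"
p_attributes = {
    "first name William": 0.004,
    "surname MacDonald": 0.0005,
    "born 1953-04-02": 1 / 30_000,         # roughly one date among ~82 years of dates
    "born in Ireland": 0.0006,
}

p_random_match = 1.0
for p in p_attributes.values():            # treated as independent for this sketch
    p_random_match *= p

expected_matches = population * p_random_match
p_same_node = 1 / (1 + expected_matches)   # crude estimate that the match is unique
print(f"expected coincidental matches: {expected_matches:.4f}")
print(f"probability of being the same node: {p_same_node:.4f}")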
The direct use of SNR is to provide a shared ID, which the calling user can then use with confidence. In some cases the confidence may not be sufficient to use the shared ID immediately, and the caller may prefer to use a new or private ID until enough further information is known to make a match. SNR can also be used after a paragraph has been written with one or more private IDs, so that a subsequent step can rewrite the paragraph to replace those private IDs with public IDs. It can similarly be used to merge public IDs that denote the same entity but were not identified as identical when first used.
Multiple UL storage areas
Examples of the invention enable paragraphs to be stored in multiple separate storage areas. The storage areas may be used for different purposes and may have different access controls. Different storage areas may also have different trust levels. For example, various examples may maintain access to a UL store containing highly trusted paragraphs representing common-sense information about many widely used semantic nodes, together with useful and uncontroversial inferences. Another UL storage area may contain the employee and human resources records of a private company and severely restrict access to selected people within that company. In some examples, these restrictions may apply to an organization. The UL representation of a particular natural language book may be given its own storage area.
According to various examples, UL storage areas may be shared among multiple users. For example, a trusted core UL store of widely used semantic relationships may be made widely available. Storage areas containing UL representations of books or key knowledge may be commercially licensed by the organizations that build and maintain them.
Translation
Translation is the act of converting UL to and from natural language, for purposes including communicating with humans, learning, and understanding information stored in natural language.
Neural machine translation
Neural machine translation is the term used for prior art methods of translating between pairs (or more) of natural languages using neural networks. A typical architecture includes an encoder that converts a source sentence into an internal vector or sequence of vectors encoding the source sentence, which is then read by a decoder that generates the corresponding sequence of words in the target language. Variations of this architecture use recurrent neural networks, including Long Short-Term Memories (LSTMs), various attention mechanisms, and, most recently, transformers. Such an architecture can be considered a conditional language model, in which the output of the language model is conditioned on the source-language input.
Examples of the present invention utilize neural machine translation architectures, but rather than using natural language alone, they utilize neural networks that have been trained on translations between natural language and UL (or the like). The resulting neural network can then generate UL corresponding to the meaning of the input natural language. In the preferred example, the vocabulary includes the semantic nodes and the left and right parenthesis symbols.
Importantly, in contrast to a neural translation system that translates between natural languages, a neural architecture designed to translate between natural language and UL can be considered a system that understands natural language, as the resulting UL (or UL-like) representation fully represents the semantics of the natural language and is machine processable. This UL can then be used for reasoning, question answering, and the other actions and applications described herein. In a machine translation system between natural languages, both the source and the target present the machine-understanding problems of all natural languages, as previously described herein.
Beam search is a method in which, instead of just reading the most likely output symbol from the decoder at each step of a neural machine translation system, a set of possible outputs from the decoder and their probabilities is maintained, which can be used to generate a list of possible translations. Examples of the present invention that can validate UL use beam search and remove invalid UL translations from the results, to ensure that the generated UL is meaningful. Automatic validation of UL may also be used to ensure that the system is trained only on valid UL. According to various examples, validation paragraphs (as described herein) may be used for this automatic validation.
Other translation methods/alternative examples
One method for translating to and from UL, used in some examples, is to look at UL paragraphs that have been marked as "ground truth translations". These are known translations between UL paragraphs and paragraphs written in a natural language (such as English) that are assumed to be accurate. In some examples, these may simply be stored in a database listing the ground truth translation for the corresponding UL paragraph. In some examples, the translation itself may be stored as UL, such as:
(EnglishGroundTruthTranslation (IsA Brie ((Cheese French) Creamy)) "Brie is a creamy French cheese")
This means that "Brie is a creamy French cheese" is an accurate English translation of (IsA Brie ((Cheese French) Creamy)). If we refer to this paragraph as GroundTruthTranslation1 and we also have (IsA GroundTruthTranslation1 GroundTruthTranslation) in the storage area, we can use this known "correct" translation as the basis for other, similar translations. Using the above method, perfect translations can be generated whenever there is an exact match to or from the UL, such as "Brie is a creamy French cheese". These paragraphs can also be used to translate things that do not match exactly. A simple example of an inexact match is the English paragraph "Camembert is a creamy French cheese".
The method used in some examples depends on the direction in which we need to translate. When translating from natural language to UL, we decompose the structure of the given sentence and compare it with the structure of each of the known ground truth translations to rank them by similarity. The sentence is divided into words (or other atomic parts of the language) and then recombined into sub-parts (word sequences) for which we have existing translations, such as (ExpressedInEnglish Camembert "Camembert") and (ExpressedInEnglish IsA "is a"). These two paragraphs mean that the Camembert node becomes an option for the "Camembert" part of the sentence, and IsA becomes an option for the "is a" part. When "Camembert is a creamy French cheese" is matched against GroundTruthTranslation1, the translator gives a high similarity score, because most of the sentence is identical and the only differing part ("Camembert") has the same part of speech as "Brie" and has, among its options, a node (Camembert) that is very similar to the node used in GroundTruthTranslation1 (Brie). In a preferred example, the similarity of these two nodes is compared using a component of the UL platform called the entity solver.
According to various examples, the entity solver works by comparing a large number of paragraphs about the two nodes and determining how similar they are in use. If they are used in very similar ways, they are likely to be very similar nodes, and direct substitution in a translation is likely to be accurate. For example, they may belong to the same class, so we may see (IsA Brie Cheese) and (IsA Camembert Cheese), which are identical except for the nodes we are comparing. In other examples, the entity solver incorporates further heuristics or features to determine the similarity of two given nodes.
Translation from UL to English again uses the entity solver component, comparing the UL being translated with the known ground truth translations and picking the most similar one. The translations of the differing nodes are then substituted to form the final output string.
Word embeddings, such as word2vec or GloVe, are a technique known to those skilled in the relevant art in which a large amount of text is analyzed to determine words with similar meaning and usage. Various examples use this technique to determine the similarity of natural language words and their suitability for substitution in known ground truth translations. For example, analysis of English would determine that Camembert and Brie are very similar items, as their word embeddings are very close to each other. This means that a ground truth translation containing "Brie" can almost certainly have the word "Camembert" substituted for "Brie", with the corresponding semantic nodes exchanged in the UL half of the translation.
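A small Python sketch of this similarity test follows; the tiny hand-made vectors are assumptions standing in for real word2vec or GloVe embeddings, which would normally be loaded from a pretrained model.

# Hypothetical sketch of using word embeddings to judge substitutability in a ground
# truth translation via cosine similarity.

import math

embeddings = {
    "brie":      [0.81, 0.52, 0.05],
    "camembert": [0.79, 0.55, 0.07],
    "gatwick":   [0.10, 0.15, 0.92],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def substitutable(word, known_word, threshold=0.95):
    """Words close enough in embedding space are candidates for substitution."""
    return cosine(embeddings[word], embeddings[known_word]) >= threshold

print(substitutable("camembert", "brie"))   # True  - swap the word and the semantic node
print(substitutable("gatwick", "brie"))     # False - not a plausible substitution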
Another technique used by various examples involves automatically evaluating the semantic impact of changes to a natural language expression. In natural language there are often many ways of saying the same thing. Sometimes a rewording produces another paragraph with exactly the same meaning; in other cases the change in semantics is small; in still other cases the semantics differ widely. Using an automatic method for evaluating the semantic impact between two natural language paragraphs, a ground truth translation may be used even when there is no exact match to its natural language, provided the semantic impact is evaluated as small or absent.
Examples of techniques that may be used for semantic impact assessment include the replacement of words known to be synonyms or of similar meaning, other rewordings known to express the same thing (e.g., in English, "<noun1> of <noun2>" versus "<noun2>'s <noun1>"), and the addition of filler words that change the meaning only subtly.
According to various examples, the translator uses a pipeline in each direction, where each pipe is a function that receives a document structure and returns a new document structure. Each pipe may add components to the document for use by later pipes. The final document contains a part that the pipeline treats as the output part, and the pipeline returns that part as its output. The first pipe in each pipeline performs a direct lookup in the cache, and if this returns a successful translation the remaining pipes can be skipped. In the English-to-UL direction, we then run a series of pipes using the Stanford CoreNLP library or the like to tokenize the sentence and tag it with part-of-speech information that can be used to help the ground-truth-translation pipe determine the best match in later pipes. When translating from UL to English, the only pipe used before the ground-truth-translation pipe is the direct-lookup pipe, since the UL itself should already provide enough semantic information to be translated back into natural language.
An optimization present in various examples is the use of Bloom filters to identify natural language paragraphs that are not present in the translation storage area, in order to reduce load on the system. A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set.
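A self-contained Python sketch of this optimization follows; the filter parameters and the stored sentences are illustrative assumptions, and a production system would use a tuned or library implementation.

# Hypothetical sketch of the Bloom filter optimization: before querying the translation
# store, a cheap membership test rules out paragraphs that certainly have no stored
# translation (false positives are possible; false negatives are not).

import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        return all(self.bits & (1 << pos) for pos in self._positions(item))

stored_translations = BloomFilter()
stored_translations.add("Brie is a creamy French cheese")

print(stored_translations.might_contain("Brie is a creamy French cheese"))  # True
print(stored_translations.might_contain("Gatwick is a cheese"))             # almost certainly False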
According to various examples, the translator will actively vary its natural language output among a broad range of semantically equivalent, natural-sounding translations, to create varied and fresh speech for the benefit of users of products powered by the present invention. For ground truth translations, this may be done by selecting randomly among multiple translations of the same UL. The other techniques described herein either naturally generate multiple candidate translations as well, or can readily be adapted to do so.
Translation context
UL can encode self-referential and meta-language statements: how and when a translation should be performed, and what is more appropriate in one context than another, can themselves be described in UL. For some applications, it is desirable to be able to generate translations that are particularly suited to the current context. Some examples use methods for expressing descriptive contextual information in UL about the semantic nodes that may be translated. By making this information available to the system at runtime, reasoning can select the most appropriate designation (e.g., "vehicle", "car", "my Audi") from those available, which can then replace the original node in a preprocessing step. Other translation techniques can then be applied to render it in natural language.
Translation between natural languages
The purpose of UL is to fully represent the meaning of any natural language, and the language can easily be extended to accommodate nuances and new concepts from new languages for which nodes may not have been created before. This means that once a document or a piece of natural language has been translated into UL, no content is lost, and the UL translation can contain all the semantics and nuances of the original. This is in contrast to translation into another natural language, which is inherently imprecise, since words in the target language may not have exactly the same meaning, or may not exist at all.
For this reason, an improved approach to translation between multiple languages is to build a translator to and from UL for each natural language, and to translate between natural languages by first translating the source language into UL and then translating from UL into the destination language.
Prior art translation systems, such as neural machine translation systems, typically learn from examples of text between pairs of natural languages, or direct effort and resources at specific language pairs. This means that for n languages, on the order of n squared translation systems are needed to cover all language pairs. With UL as an effective intermediate language, only 2n systems need to be built: for each language, a UL -> NL system and an NL -> UL system.
Representation of emotion, connotation, etc
Words in natural language often have connotations or implications that extend or attach to their pure semantics. For example, in English there are a number of near-synonyms for the word "mistake" (such as "boo-boo", "screw-up", "inaccuracy" and "error"). Although considered synonymous, these different words have different connotations and usages: for example, "boo-boo" is used by or about children, or mockingly; "inaccuracy" politely suggests a relatively small mistake; "screw-up" implies a major mistake for which the person responsible carries considerable blame.
According to various examples, these connotations and usages may be represented by having a different semantic node corresponding to each of these concepts. The meaning carried by these connotations can be expressed in further UL that relates these semantic nodes to similar but distinct nodes while also describing the differences.
Nesting can be used to represent tone and mood in writing, as well as other characteristics such as nuance or formality. For example, a semantic node representing an angry tone may be combined with the paragraph being spoken to represent a paragraph delivered in an angry tone. In a speech example, such representations may be used to modify the generated audio so that the output properly conveys the emotion.
Specific applications built with examples of the present invention
Recruitment application
An automatic recruiting application is an application that attempts to automatically find highly qualified candidates for a given job position: the representation of the resume of the potential job seeker is matched with the representation of the role and job specification. Further examples of such applications may also match the desired roles of the job seeker with the job description and evaluate how well the job seeker qualifies.
There may be hundreds of thousands of possible candidates online who might match a given job. Prior to the present invention, recruitment was typically performed by a human using tools to search a database of such candidates. Typically, such searches are either based purely on unstructured data - searching for keywords in resumes - or combine keywords with limited structured data. For example, a job site specializing in software engineering talent may include structured data on common programming languages, and the search page may include drop-down menus or check boxes for these specific skills in addition to allowing keyword search. Such limited structured data is created in the conventional way, using a database schema and specific code to include this data in searches.
In addition, some applications attempt to apply prior art NLP technology to both the resumes and the job specification, and then attempt to rank job applicants according to how well they statistically match the specification. The limitations of prior art NLP mean that the ranking is only approximate and is driven largely by similar keywords appearing in both the resume and the job specification. This provides some value to recruiters who want to narrow a list of resumes, but still requires a significant amount of human involvement to have high confidence in a good match, and good matches may never be seen by the recruiter.
In practice, the number of different skills or experiences that may appear in a resume or job specification is very large, meaning that any structured data designed into the system can only cover a small portion of what might be searched for.
Thus, automatic recruitment is an example of a HUB application.
An example application called Jobe is described herein. It represents a preferred example of a recruitment application; other related examples are also described.
In a preferred example, most job specifications and candidate resumes are represented in UL or the like, and the reasoning methods described herein are used to determine whether they match. In various examples, UL or similar representations of at least some of the candidate's goals also match the job specification and possible employer descriptions.
FIG. 1 shows an example of a push notification from Jobe on a user's cell phone, informing the user that a new candidate matches one of their jobs. The matching occurs automatically. If the matching were statistical or used existing inexact methods, the application designer would have no confidence in interrupting the user with this message, as the match would often be poor. Because the technology in this application is based on an example of the invention, it is known with very high confidence that the automatic match is good, and interrupting the user is therefore a good product experience. In this example, then, the present invention enables a product experience that was not previously possible.
FIG. 2 shows an example of the details of a match, in which the requirements of a role are matched against data from a candidate's resume. FIG. 2 illustrates three example job requirements for a software engineer position: experience with a major object-oriented language, fluency in a foreign language, and location. Specifically, Jobe matches the requirement "more than 1 year of a major object-oriented language" against "3 years of C++ programming"; fluency in Portuguese is inferred from the fact that the candidate attended high school in Brazil (where the dominant language is Portuguese); and "within commuting distance of London" is inferred from the fact that the candidate resides in Hertfordshire, UK.
In these examples, none of the evidence from the resume shares any keywords with the requirement it matches. All three matches require reasoning using the UL representation and draw on the semantics of the requirements and of the job specification.
It is further illustrated here how one of these matches is done using UL:
As described herein, examples of the invention can answer yes/no questions. To match such a candidate, the system asks itself the question: "Does 7 years of C++ experience imply programming experience in a major object-oriented language for at least five years?"
"7 years of C++ experience" can be expressed in UL as ((Experience CPlusPlus)(Year(RealNumber"7"))) - Experience is combined with another semantic node to express experience with that concept.
"At least five years of programming experience in a major object-oriented language" can be expressed as
((Experience(Programming(UnspecifiedMember((ProgrammingLanguage ObjectOriented)Major))))(Year(AtLeast(RealNumber"5"))))
AtLeast combined with a number gives a range of numbers; combined with a unit this gives a range of quantities - in this case, at least five years. UnspecifiedMember represents an unspecified member of a class.
Thus, the entire problem can be expressed in UL as follows:
((Question)(Implies((Experience CPlusPlus)(Year(RealNumber"7")))((Experience(Programming(UnspecifiedMember((ProgrammingLanguage ObjectOriented)Major))))(Year(AtLeast(RealNumber"5"))))))
To answer this question, we use the following trusted paragraphs from the UL store, which represent recruitment-related information. According to various examples, these paragraphs are generated from conversations between the system and a person, by translating natural language into UL, or are added directly by a trusted person building the recruitment application.
(IsA ProgrammingLanguage Class)
(IsA ObjectOriented Attribute)
(IsA Major Attribute)
(IsA CPlusPlus ProgrammingLanguage)
(IsA CPlusPlus((ProgrammingLanguage ObjectOriented)Major))
(IsA Year Unit)
In addition to these known paragraphs, a computation unit and inference paragraphs are also required. The computation unit for comparing whether 7 is greater than or equal to 5 is defined as follows:
(ComputationInputs GreaterThanOrEqualComputationUnit InputOne InputTwo)
(ComputationDescription GreaterThanOrEqualComputationUnit((Question)((GreaterThanOrEqual(RealNumber InputOne)(RealNumber InputTwo)))))
(ComputationLocalJavaLocation GreaterThanOrEqualComputationUnit"ai.unlikely.questionprocessor.computation.Comparison$GreaterThanOrEqual")
The inference paragraphs required for this question are as follows.
Core reasoning paragraphs helping to define the IsSubclassOf relationship:
(ConsequenceOf(IsSubclassOf AB)(IsA(UnspecifiedMember A)B))
(ConsequenceOf((IsSubclassOf X Y)(IsA A Attribute))(IsSubclassOf(X A)Y))
(ConsequenceOf(IsA X Class)(IsSubclassOf X X))
Inferences about quantity, activity, implications, etc.:
(ConsequenceOf((Implies X Y)(QuantityWithinRange A B))(Implies(X A)(Y B)))
(ConsequenceOf((IsA X ProgrammingLanguage)(Implies(Experience(Programming X))Z))(Implies(Experience X)Z))
(ConsequenceOf((IsA X Activity)(IsA Y Activity)(Implies X Y))(Implies(Experience X)(Experience Y)))
Programming in a programming language is an activity:
(ConsequenceOf(IsA X ProgrammingLanguage)(IsA (Programming X)Activity))
If X is a programming language and is a member of class C, programming X means programming a member of class C
(ConsequenceOf((IsA X ProgrammingLanguage)(IsA X C))(Implies(Programming X)(Programming(UnspecifiedMember C))))
If A is within range B and X is an arbitrary unit, then A X is within range B X
(ConsequenceOf((IsA X Unit)(WithinRange A B))(QuantityWithinRange(X A)(X B)))
If X is greater than or equal to Y, X is within the range "at least Y":
(ConsequenceOf(GreaterThanOrEqual X Y)(WithinRange X(AtLeast Y)))
Using the question answering methods described herein, a "Yes" result may be generated for the question.
To further illustrate the method, the following explanation shows steps that may be generated using some examples of the inference methods described herein:
(Implies((Experience CPlusPlus)(Year(RealNumber"7")))((Experience(Programming(UnspecifiedMember((ProgrammingLanguage ObjectOriented)Major))))(Year(AtLeast(RealNumber"5")))))
(Implies(Experience CPlusPlus)(Experience(Programming(UnspecifiedMember((ProgrammingLanguage ObjectOriented)Major)))))
Known (IsA CPlusPlus ProgrammingLanguage)
(Implies(Experience(Programming CPlusPlus))(Experience(Programming(UnspecifiedMember((ProgrammingLanguage ObjectOriented)Major)))))
(IsA(Programming CPlusPlus)Activity)
Known (IsA CPlusPlus ProgrammingLanguage)
(IsA(Programming(UnspecifiedMember((ProgrammingLanguage ObjectOriented)Major)))Activity)
(IsA(UnspecifiedMember((ProgrammingLanguage ObjectOriented)Major))ProgrammingLanguage)
(IsSubclassOf((ProgrammingLanguage ObjectOriented)Major)ProgrammingLanguage)
(IsSubclassOf(ProgrammingLanguage ObjectOriented)ProgrammingLanguage)
(IsSubclassOf ProgrammingLanguage ProgrammingLanguage)
Known (IsA ProgrammingLanguage Class)
Known (IsA ObjectOriented Attribute)
Known (IsA Major Attribute)
(Implies(Programming CPlusPlus)(Programming(UnspecifiedMember((ProgrammingLanguage ObjectOriented)Major))))
Known (IsA CPlusPlus ProgrammingLanguage)
Known (IsA CPlusPlus((ProgrammingLanguage ObjectOriented)Major))
(QuantityWithinRange(Year(RealNumber"7"))(Year(AtLeast(RealNumber"5"))))
Known (IsA Year Unit)
(WithinRange(RealNumber"7")(AtLeast(RealNumber"5")))
Calculation (GreaterThanOrEqual(RealNumber"7")(RealNumber"5"))
Various examples may generate natural language translations of these steps, either showing every step in full or skipping obvious common-sense steps to create a simplified explanation that is easier to understand. FIG. 7 shows an example of such an explanation generated by various examples.
Horizontal health application
A horizontal health application is an application that attempts to record and manage an extremely diverse set of health data from one or more users. As discussed herein, prior to the present invention it was impractical to represent this data, using prior-art techniques, in a manner that a computer system can understand and manage.
Nutrition: there are millions of different foods and millions of different consumable substances. These substances are sometimes related (e.g. type of fat) and they have many different properties. The interactions of these substances together are very complex-and the deep semantic representation of nutrition can allow computer-based systems to give very complex dietary advice and unlock interactions that were not observed before. Thus, nutrition is a HUB application.
More general health is also an example of an unreasonably extensive field; nutrition is a sub-area of this extremely broad area. An application that tracks a person's daily health information needs to incorporate a large number of health tests and cover areas such as: the levels of certain substances in the blood, organ measurements, body composition measurements, measurements of physical performance in various areas, activity information, nutrition information, genetic data, microbiome data, sleep data, specific events affecting health (exercise, eating, drinking, substance intake, mood, bowel movements) and numerous recorded health conditions and diseases. Any of these types of data may be correlated with other data and with the user's health goals. While a small subset of these can be built with the typical vertical solutions we see today, building a large general-purpose health application potentially containing all of this information was impractical prior to the present invention.
"Chea" is an example application from such applications described herein. Which represents a preferred and other examples of the present invention.
In addition to logging health data from wearable devices and other health sensors, Chea has a chat window in which users can communicate health-related events that have occurred and have the application understand, store, and process those events. An example health event may be a nutritional event: consuming food or drink. The window may also optionally be used to record other health events: specific symptoms of illness, information about mood and energy levels, bowel movements characterized on the Bristol stool chart, and the like.
FIG. 3 shows a conversation within the app in which nutritional data is being communicated with the app. In a preferred example, the conversation is between the user, an AI, any number of human medical professionals, and a human nutrition technician who derives semantic nutritional data from the photographs and descriptions of food and drink entered by the user when the food and drink are consumed. The AI handles tasks automatically where it can; where automatic processing is not possible, a human can accomplish the task. An example of an automated task would be prompting the user when no nutritional or other health information has been added over a period in which regular logging would be expected. For example, if no food or drink has been entered for a long part of the day, the AI may ask the user whether this is genuinely the case or whether they have forgotten to log it. If the user intends to record details of bowel movements, they may likewise be prompted if an unusually large gap occurs with no information shared. The semantic nutritional data represents not only exactly what was consumed and when, but also uncertainty - for example, uncertainty about portion sizes arising from not being able to determine from the image exactly how the food was cut.
UL supports this uncertainty. For example, it may not be possible to determine what kind of cheese is shown in an image, and the user may not know either; in this case the generic semantic node for cheese can be used. If a more precise type of cheese is known for the recipe, such as cheddar, or even a very specific type of cheddar, then the appropriate semantic node can be used instead. Paragraphs in the trusted store represent the relationship between cheese and particular types of cheese, as well as much related information about these nodes.
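As an illustrative sketch only (the node names CheddarCheese and ContainsIngredient are assumptions rather than definitions from elsewhere herein), the trusted store might contain paragraphs such as:
(IsA Cheese Class)
(IsSubclassOf CheddarCheese Cheese)
(ContainsIngredient <Meal1>(UnspecifiedMember Cheese))
so that a meal containing an unidentified cheese can be recorded with (UnspecifiedMember Cheese), and refined to CheddarCheese if the exact type later becomes known.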
UL can also represent information about the likely constituent substances of a food. For example, if the image shows part of a chicken pie, UL can represent the ingredients of the pie, including the pastry and the typical ingredients of the filling. Uncertainty about the exact composition, and variation in the quantities of these ingredients, can also be represented in UL. These uncertainties and variations, combined with the uncertainty in the portion shown (as communicated by the nutrition technician), can be combined into a detailed semantic decomposition of the meal, with its uncertainties, which can then be recorded. Because this nutritional data is purely semantic, and the application also has relationships and other information about these substances represented in UL, the data can be viewed in many different ways and measured and plotted through many different lenses to derive health insights.
FIG. 4 shows some example insights that may be derived from a period of horizontal health and nutrition data. By combining data from the wearable device (including pulse and sleep data) with other events and mild illnesses recorded in the conversation data, and correlating these negative events with the intake of certain substances, the app concludes that the user may be mildly intolerant to sulphites. With this information, the app can help the user steer away from foods containing this substance in the future. The second example insight in this figure is a strong relationship, found for this user, between eating earlier and sleeping better. Sleep data from the wearable device is available to the horizontal health application and can be compared over a period of time with the nutritional data, including the times at which meals were eaten. With enough data, the insight can be shared explicitly with the user, who can then improve their sleep, and thus their health, by planning to eat earlier than before. Such insights would not be possible without extremely extensive health data being stored and semantically accessible to the machine so that these automatic insights can be generated.
FIG. 5 shows an example graph of daily calories consumed versus calories expended, which is a very common thing to track for users aiming to lose (or gain) weight. Detailed semantic information about what the user consumed enables the graph to show error bars, giving an accurate range for the calories ingested on a given day. The wearable device measures the user's physical activity, which, combined with data about their weight, enables an estimate of the calories expended during the day, again with error bars. Unlike other applications, which do not estimate calories with error bars, this approach can more accurately identify the days on which the user is likely to have lost weight, by also identifying the days on which the two measurements are too close to call either way (the error bars overlap). This is better than other applications, which give a false sense of accuracy: calorie measurement is inherently error-prone, and it is entirely possible to be hundreds of calories out when assessing food intake, giving the user the false impression that they are in a calorie deficit when in fact they are not.
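As a purely illustrative worked example (the numbers are assumptions): if intake is estimated at 2,100 ± 250 kcal (range 1,850-2,350) and expenditure at 2,300 ± 200 kcal (range 2,100-2,500), the ranges overlap and the day is reported as too close to call; if intake is 1,800 ± 200 kcal (range 1,600-2,000) and expenditure 2,400 ± 200 kcal (range 2,200-2,600), the ranges do not overlap and a calorie deficit can be reported with confidence.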
FIG. 6 illustrates another example of a visualization that can be generated by an example of the present invention. It compares an estimate of the caffeine in the user's body when they go to bed with a calculation of sleep quality. The sleep quality metric comes from one or more wearable devices and is calculated by combining various sleep measurements. The pre-sleep caffeine level is derived from the nutritional data collected by the app. For example, a cup of coffee consumed at 2 pm has an estimated caffeine content, and by assuming a half-life for the rate at which the user metabolizes caffeine out of the body, it is possible to estimate how much of that caffeine remains at their known bedtime. Other examples, using more sophisticated models of the decay rate, may use the user's weight and DNA, since certain genetic information is known to affect how quickly the body metabolizes caffeine, as well as other factors (such as food consumed) that affect absorption in the stomach. By plotting sleep quality against estimated caffeine, users can see whether their caffeine consumption does appear to affect their sleep, and can therefore plan to consume less caffeine or consume it earlier in the day.
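As a purely illustrative calculation (the dose and half-life figures are assumptions, not values from the original): with remaining = dose × (1/2)^(Δt / t_half), a 150 mg dose of caffeine consumed at 2 pm, a bedtime of 11 pm (Δt = 9 hours) and an assumed half-life of 5 hours gives 150 × (1/2)^(9/5) ≈ 43 mg of caffeine estimated to remain at bedtime.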
These graphs and insights are examples. The nature of a horizontal health application like Chea is that an almost unlimited number of insights can be found in the data. A preferred example searches for correlations between the collected data, guided by known hypotheses. For example, many different causes of diarrhoea are known, and by forming a hypothesis for each cause and checking whether the health data strongly suggests that this is the cause for this user, an insight into the cause can be revealed. In the example of FIG. 4, the insight may be sulphite intolerance, especially if other known symptoms (such as hives or flushing) are recorded in a time window associated with intake of the substance. Without such data, the user might have the intolerance and never be able to establish the link. Further examples may also surface very strongly correlated insights even without a known hypothesis as a basis.
Accounting
A typical example of structured data is accounting data, whose form has largely not changed for centuries. Accounting represents information by treating every transaction as a set of "debits" and "credits" that are matched against a limited number of ledgers. These ledgers represent broad semantic categories. Centuries ago the ledgers were physical paper books and the debits and credits were recorded on paper. On a computer, the ledgers are now semantic categories, and the debits and credits are database entries that identify a ledger. However, most of the semantic information associated with these transactions is still natural language.
In these systems, the computer system does not know the meaning of the ledgers, nor the real-world meaning of the transactions within them. The structured data does enable many common accounting reports to be generated immediately, but many questions about the data may require extensive manual inspection of the natural language associated with the transactions: the written description of the transaction, the original invoice, and the ledger name. If this semantic information, currently in natural language, were sufficiently represented semantically, more questions could be asked of the data automatically and more reports could be generated.
For example, acceptable accounting standards vary from country to country. A company's accounts may be compiled under one accounting standard, and it may then be difficult to view the numbers again under a different set of assumptions. With sufficient machine-readable semantic information, however, such alternative views of the accounts could be generated automatically and almost immediately.
Another example is wanting to ask specific questions about categories within a ledger. For example, a ledger of "consultancy" costs may include costs associated with the marketing of several different products as well as consultancy costs related to recruitment. These are separated only if the need to do so was anticipated before the bookkeeping was done, at which point separate ledgers could have been created for the different classes of transaction. Attempting to do so later would require a person to examine the transactions in the ledger and total the different categories separately.
However, with detailed transactions represented in UL, this task can be done automatically by the application, because there is enough machine-understandable data for the machine to perform it. According to various examples, this is done by creating the new virtual ledgers at a later time and automatically assigning the historical accounting transactions to them, without human effort.
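As an illustrative sketch only (the node names AccountingTransaction, HasLedger, HasAmount, RelatesTo, ConsultancyLedger and GBP are assumptions rather than definitions from elsewhere herein), a transaction might be represented by paragraphs such as:
(IsA <Transaction1> AccountingTransaction)
(HasLedger <Transaction1> ConsultancyLedger)
(HasAmount <Transaction1>(GBP(RealNumber"500")))
(RelatesTo <Transaction1>(Marketing <ProductA>))
With paragraphs of this kind, a later query for all consultancy costs relating to the marketing of <ProductA> can be answered, and such transactions can be reassigned automatically to a newly created virtual ledger.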
More extensive use of UL within human-machine interface
As shown herein, UL or a similar representation works well as a general representation for automated systems and can represent actions or information provided by a human user to a machine. Any language-based human-machine interface (spoken or written) can be translated into UL and the UL provided to the machine.
In addition, non-verbal interfaces can also associate UL or similar representations with various human actions, providing the machine with a specific representation of human intent. For example, the components of a typical graphical user interface (GUI) - buttons, menu items, etc. - may each have a UL paragraph associated with them that represents the action or intent associated with activating that GUI element; when a human user clicks on or otherwise activates the element, the corresponding paragraph, or a version of it (possibly including other associated data) describing the action taken, is sent to the associated computing system to be acted upon.
Searching and analyzing documents or web pages.
As described herein, UL or similar representations may be translated into or out of natural language.
A UL-powered search system includes one or more document stores and provides an interface through which one or more human users can query the document stores. In a search system powered by examples of the invention, at least a portion of the documents in the document store have been automatically translated into UL, at least some of the users' queries are automatically translated into UL, and the system responds to user requests by utilizing the translated UL.
In a web search system powered by an example of the present invention, the document store includes pages from the world wide web, which are indexed and then at least partially translated into UL. Translation may include converting the natural language components of these pages to UL, or converting tables or other structured data to UL.
According to various examples, the answer to the query may include a link to a web page containing the information being searched or providing the service being searched, or the system may provide the information directly in the form of a text or spoken answer. According to some examples and in some cases, such a direct response may be accompanied by a link to the source of the information and include associated data, such as an image or a table.
Where such a search system is unable to fully translate a document or web page into UL, existing keyword-based or other prior-art search may be used to supplement the UL-generated response or to act as a fallback for it.
Map data represented in UL, associated systems utilizing map data, and location-based search
Map data represents the information typically found in machine-readable forms of maps. It also includes additional data, including metadata. It is used in map applications where people need to find directions, and by automated systems that utilize such data, such as autonomous vehicles.
Map data may be expressed in UL as described herein. Map applications and automated systems using map data may be improved by examples of the present invention by having at least a portion of their map data represented in UL or the like and utilizing the techniques described herein to query and reason over that representation. Some examples may use UL to query remote systems to further augment their capabilities, for example by querying a remote UL-powered system using data from a local UL store or data sensed or discovered at their current geographic location.
Identifying related advertisements and news
By having available information about the user, represented in UL or the like, examples of the invention are able to find relevant items to display to the user. These related items may be advertisements, news articles, or other items of information that may be of value to the user or the publisher of the item.
The UL representing information about the user may come from translating some or all of the information contained in the user's social media presence: posts, profile information, "likes" and the like. It may additionally or alternatively come from translating some or all of the user's web search or web browsing history into UL or the like. According to various examples, it may additionally or alternatively come from natural language conversations between the user and the system, where the system stores and remembers information that the user has given about himself or herself.
The UL associated with a relevant item may come from a translation of the natural language associated with the item; for example, in the case of a news article it may come from an automatic translation of some or all of the headline or article content. In the case of an advertisement, it may come from the natural language text in the advertisement, from text found at the advertisement's click-through destination, or from translating the results of an automatic image recognition system, where the content of the image is translated into UL or UL semantic nodes. For some systems, UL may be manually associated with the item; for example, the publisher of a news item may include such a semantic representation of the article as part of the publication process.
For example, analysis of a user's social media profile may lead the system to learn that the user is a keen cyclist and to record this information in UL. Items relevant to this user may include advertisements for bicycle-related products, news items about cycling, and the like. The inference capabilities described herein enable more indirect and more accurate matching than prior-art keyword-based systems. For example, a news article about a triathlon event taking place near the user's home can be inferred to be of interest using knowledge represented internally in UL - namely, that a triathlon includes a cycling component - even though that component is not explicitly mentioned in the article. An advertisement for a nutritional supplement that alleviates muscle soreness after training can be inferred to be relevant to a keen cyclist whose social media posts show that they are training hard, via a chain of inference and a semantic representation of the value and use of the supplement. A system powered by an example of the present invention can build this link with high confidence and without any keyword or text similarity, in contrast to prior-art methods, which require similar keywords and whose statistical relevance carries lower confidence than a system with semantic understanding.
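As an illustrative sketch only (the node names InterestedIn, Includes, Cycling, Triathlon and RelevantTo are assumptions rather than definitions from elsewhere herein), the chain of inference might rest on paragraphs such as:
(InterestedIn <User1> Cycling)
(Includes Triathlon Cycling)
(ConsequenceOf((InterestedIn X Y)(Includes Z Y))(RelevantTo Z X))
from which (RelevantTo Triathlon <User1>) follows, so an article about a local triathlon can be judged relevant even though it never mentions cycling keywords.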
According to some examples, where a relevant advertisement has been matched by inference, the user can be told why the advertisement or other item was shown to them, i.e., given an explanation.
The hybrid system may combine prior art keyword or text analysis matching with analysis or matching based on examples of the present invention, e.g., using UL when available and using it to replace or augment the results based on prior art methods.
Aggregation and summary of news
In a system that translates news items into UL, examples can identify common information originating from different articles and present that common information to users as a summary or aggregation of the different sources. Examples that hold personal information about the user in UL may select and adapt which news to share based on that personal knowledge. Such personal information may include the user's interests, location, employer, the industry they work in, and other personal information relevant to which news they will find relevant or interesting.
Matching between people using UL
UL can be used to match people with each other by associating their profiles and related information with UL representations and using the reasoning and matching techniques described herein to conclude that they match. Various examples may choose to use the methods described herein to explain the inference process. The associated UL or UL-like information may come from automatic translation of some or all of the natural language present in their profiles. Examples of the invention may also choose to interact with users via a dialogue and record their responses in UL in order to generate it. It may also come from recording the results of machine learning models in UL - e.g., predictions of user attributes, image recognition of the content of photographs and videos posted by users, or transcription of audio data associated with a profile and its subsequent translation into UL.
Matching of people enabled by examples of the invention includes suggesting potential "friends" in a social media application, suggesting potential business contacts in a career-related social media application, or suggesting potential dates within a dating application.
Identifying abusive or untrue posts in social media
Many social media applications need to identify abusive and similar posts, and many operate at a scale at which human review of such posts is impractical. It is therefore desirable to identify all or most such posts automatically. Abusive posts may include posts or media that are racially or otherwise offensive to users, that describe offensive or illegal things, that have national-security or criminal implications, that infringe intellectual property, that spread false information in a harmful way, or that otherwise break the rules of the application or website in which they appear.
By associating UL representing its content with a post, such abusive content can be automatically identified in a manner superior to prior-art methods. For example, a post may not contain any keywords that identify it as abusive, and reasoning may be required to identify it as such. UL represents semantic information, and the techniques described herein can be used for that reasoning.
Examples may also identify posts as abuse by comparing UL associated with the posts to UL representations of site rules using techniques described herein for matching actions to rules.
The UL associated with a post may be obtained by partially or fully translating the natural language in the post into UL, using techniques described herein or otherwise, or by recording in UL the output of a machine learning or other model that has processed the post - e.g., a classification of the post or an identification of non-text content in the post, such as the content of images, video, or audio.
Examples of the present invention may also combine existing prior art with UL or similar analysis to identify abusive posts - for example, by using the UL techniques where available and the prior art where not, or by combining the signal(s) from positive results from the UL analysis and positive results from the prior art into a total score, and using that score when deciding whether to take an action. Actions include hiding the post or bringing it to the attention of a human moderator.
Examples of the invention may also generate a natural language explanation of the analysis that determined the post to be abusive. Such a natural language explanation may be sent to the originator of the post, as an explanation of why an action was taken or as part of a request or warning to them, or to a human moderator to help them understand what problems the post may have.
Analysis of customer reviews
Reviews are written descriptions, in natural language, of services, products, and companies, by users who have experienced them. By translating some or all of these reviews into UL, a system utilizing customer reviews can use the UL-represented reviews for various useful purposes, including: (a) answering questions from other customers about the services, products, and companies, where at least part of the information required to answer the question is represented in the UL translation of the reviews, including cases where inference or reasoning in combination with other UL-represented information is needed; (b) more generally answering questions about other products, services, or businesses, where information in the reviews helps generate the answer; or (c) other types of automated analysis of the specific products, services, and businesses described in UL.
Shopping queries and product requests
In addition to reviews, other sources of shopping-related information that may be represented in UL or the like include (a) written product descriptions, e.g., originating from the manufacturer or vendor of the product, and (b) structured data in a product database.
By representing such information in part or in whole in the UL or the like, the techniques described herein can be used to automatically answer questions related to the product. The automated purchasing assistant may also have a dialogue with the potential customer, answering questions and elucidating what the customer is looking for before providing a recommendation for the product.
In other examples, shopping recommendations may be delivered passively to customers - not in response to a question or search from the customer, but in response to other information known about the customer, some of which is represented in UL. This information may include previously purchased products, previous searches, other information, and reasonable assumptions made about the customer based on this information. For example, a series of searches or purchases may indicate that a customer is making their own yogurt. Having inferred this conclusion, a system powered by the present invention can then conclude that it would make sense to show them, or offer them a discount on, a home yogurt maker.
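Again as an illustrative sketch only (the node names Purchased, UsedFor, MakingYogurt, EngagedIn, HomeYogurtMaker and LikelyInterestedIn are assumptions rather than definitions from elsewhere herein), such a passive recommendation might rest on paragraphs such as:
(Purchased <Customer1> YogurtCultures)
(UsedFor YogurtCultures MakingYogurt)
(ConsequenceOf((Purchased X Y)(UsedFor Y Z))(EngagedIn X Z))
(UsedFor HomeYogurtMaker MakingYogurt)
(ConsequenceOf((EngagedIn X Z)(UsedFor W Z))(LikelyInterestedIn X W))
from which (LikelyInterestedIn <Customer1> HomeYogurtMaker) follows.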
Voice assistant/chat robot
Voice assistants (such as those from Amazon or Apple) are intended to cover a very wide range of use cases for their users. Unlike graphical user interfaces, which show buttons and menu items only for the functions the product can perform, voice interfaces are not so constrained, and a voice assistant needs to respond sensibly to any question or command the user directs at it. This creates an almost unlimited range of possible questions, commands, or actions that may be sent to it or that it may be expected to carry out.
Prior-art voice assistants typically attempt this by building capability in individually specified and individually built vertical domains. For example, one typical domain in a voice assistant may be around the weather, or local businesses, or setting a timer. By building enough of these domains, and by having an initial step in which the product decides which domain the user is asking about, an approximation of a horizontal product can be built. However, since each domain is individually specified and individually built, typically with its own data and schema and its own code, building such a product is a difficult task and does not scale. The result is a product with large gaps in its capabilities.
Some products attempt to let third parties fill these gaps by building applications that can perform specific functions. While users can individually invoke these functions using explicit commands, it is not possible to incorporate these capabilities seamlessly into the product experience without a deep semantic understanding of what each of these applications can do. That understanding is unavailable because these applications do not have semantic representations of their domains and capabilities, which are typically implemented in independently maintained code.
However, a voice assistant implemented using examples of the present invention can potentially build a deep semantic representation of all of its capabilities in UL, further representing actions, and how those actions are carried out, in UL or UL-like representations. This means that a comprehensive assistant can be built faster, at lower cost, and with more capability. Such UL representations may be constructed by translating natural language into UL from interactions with employees or users. In some examples, the voice assistant may store useful UL learned from conversations with users, and thus learn from its users. This UL may be used to provide information to other users, or to learn possible reasoning or how to perform specific actions. In some examples, the UL representations may be added or created directly by a trusted person (such as an employee of the enterprise building the product).
UL also enables a unified representation of other information available to the product, including information that is highly relevant to the context of a conversation or action. For example, a camera operable to detect the presence of a human may be integrated with such a system, and knowledge that a user of the voice assistant is in the room, near a device that can be used to speak to the assistant, can be used as appropriate to determine a good response. It is also useful to know who else is within hearing range of the device. We refer to this herein as human presence. For example, knowing that a child is present can result in a different response than if no child were present. Human presence also supports scenarios in which the voice assistant initiates a conversation - either asking for instructions or providing timely information that was not explicitly requested. Other information besides presence may also be identified from visual or other sensors, and the output may be represented in UL and made available to the system. Examples of such other information are a human's emotional state; whether they are resting, standing, or sleeping; what clothing they are wearing; and what activities they may be doing - e.g., eating, drinking, watching television. Other contextually relevant information may include temperature, humidity and other environmental information in the home or room, the weather, news events, scheduled events in a corporate or personal calendar, and so on.
Principles for voice assistant/chat robots
This section describes specific examples of chat robots or voice assistants or similar systems that are autonomously driven by a set of motivations, goals, and values, referred to herein as the system's principles, represented in machine-readable form. In a preferred example, these principles are written by people to drive the system and are not modifiable by the system itself. The principles are represented in a machine-readable form that encodes their meaning. In a preferred example, the principles are represented as UL.
Examples using principles may use them only to check that an action complies with the principles before the action is taken, or the principles may also be used to help select or generate the actions to be performed by the system.
Non-voice-assistant examples
Note that while the preferred example is a voice assistant or chat robot capable of communicating with the user in natural language, the use of principles to select and supervise actions is not limited to voice assistants or chat robots. The techniques described herein may be applied to many other types of software system, and examples of the invention include systems that do not communicate with users in natural language.
Note also that it is possible to use principles to supervise actions without using principles to generate them. A hybrid example may use conventional programming to select actions, but use examples of the present invention to supervise those actions by checking whether the actions produced by the code are consistent with the principles.
Checking actions against the principles
According to some examples, every potential action the system may take is represented in a structured machine-readable form (such as UL) that encodes the meaning of the action, and a test is performed before the action is carried out to ensure that the proposed action is compatible with the principles. If the system concludes that the action is prohibited by the principles, the action is not carried out. If the principles allow the action, the action is performed. In such systems, the principles, the representation of actions in a form compatible with the principles, and the system's ability to reason about and explore the consequences of actions - and whether those consequences, or other views of the action, are compatible with the principles - provide a safety net that prevents dangerous or undesirable behavior by the system. The principles are thus a way to implement and enforce ethical norms in an AI system other than by direct programming.
Generating actions from principles
According to some examples, the principles may themselves be used, in combination with other contextual information, to select or infer an action that is then performed. If this is the only way in which actions are generated, it may not be necessary to check the action against the principles afterwards, although this check may also be performed in some examples.
Types of principles
Principles may include things to optimize, such as user happiness or company revenue. They may also represent constraints, such as not helping the user break the law or never using profane language. An advantage of representing the principles in a machine-understandable form is that the system can apply them to every activity it knows how to do, without further effort from a human designer. In previous voice assistant designs, such principles (if they existed at all) would exist only outside the system, among its designers, and would then have to be translated in detail by developers (say product managers and software engineers) for each use case and embodied in the code they write. If the principles later changed, a large amount of code would need to be rewritten so that the behavior of the system matched. Having the system determine its own behavior while being constrained by the principles, or at least having the principles able to block incompatible behavior, means that the principles can be changed without rewriting large amounts of code. In addition, some developers may choose to publish or otherwise share natural language translations of the principles with customers to help build trust in the voice assistant or chat robot. In some examples, the voice assistant/chat robot itself may be operable to share its principles with the user when asked, or in other appropriate circumstances.
Example principle
One example set of principles that such a system may use is:
1. Try to maximize your users' happiness
2. Win your users' trust
3. Try to give users more value than they pay for your service
4. Try to maximize the success of <the specified company providing the system>
5. Protect privacy
6. Do not do anything illegal
7. Do not help a person do anything illegal
8. Comply with your product's rules
9. Take no action that could lead to the death of a human
10. Do not alter these principles
11. Do not learn information that might assist in changing these principles
These example principles can be divided into two categories: 1-4 are goal principles (as described above, these specify things to optimize), while 5-11 are constraints (which generally prevent bad behavior). The goal principles give the system a way to generate actions that it should perform, and the constraint principles then provide a way to prevent bad actions. Together these principles drive all actions taken by the system.
In a preferred example, these principles are represented as UL. One way to do this is to define a semantic node for each principle and then to define further paragraphs that determine when each principle is violated or furthered. These paragraphs are referred to herein as sub-principles. To illustrate how this can be done, examples are given below for two of the principles:
Principle 1:
Semantic node = UserHappinessTenet
Sub-principle = "if a user requests an action, doing that action contributes to the user's happiness"
This was translated into UL as follows:
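As an illustrative sketch only, using paragraph forms that appear elsewhere herein (not a definitive encoding), such a sub-principle might be written as:
(ConsequenceOf((IsA E Event)(ReceivedBy E Brian)(EventDescription E(RequestedAction U A)))(ContributesTowards A UserHappinessTenet))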
Here, E is an event received by the voice assistant in which user U requests that action A be performed.
Principle 9:
Semantic node = NoHumanDeathTenet
Sub-principle = "if an action may lead to the death of a person, it violates the principle 'take no action that could lead to the death of a human'"
UL translations are given as follows:
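As an illustrative sketch only (the node PossibleConsequenceOf is an assumption based on the explanation given later herein), such a sub-principle might be written as:
(ConsequenceOf((IsA X Action)(IsA Y Human)(PossibleConsequenceOf X(Death Y)))(Violates X NoHumanDeathTenet))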
Here, X is an action that may cause the death of a person Y.
Multiple sets of principles
Another advantage of this approach is that the same platform can support multiple voice assistants/chat robots with different sets of principles. In addition to having different principles, these different voice assistants/chat robots may differ in other ways, establishing themselves as distinct products in the eyes of users. Such differences may include responding to different names, having different personalities (also driven by principles in some examples), and different language styles (both in the words used and in the voice, if spoken). If the products have a visual form, different visualizations of the assistant can also be used. In some examples, at least some of the principles may be controllable or changeable by the user. For example, a particular household may wish to emphasize the role the assistant has in teaching the children of the household by making this an important goal; some users may prefer their assistant to have a different personality and may override principles governing aspects of the assistant's behavior to achieve this.
Continuous operation
In a preferred example, the system runs in a continuous loop, attempting to optimize the world by taking actions that correspond to, and optimize for, its principles. Actions may include communicating with a user, performing a computation, or taking other actions that affect the world (such as, for example, changing the setting of a thermostat). The system is also aware of inputs that arrive while it runs, including output from sensors measuring the world, incoming communications from users the system is talking to, and other changes in the world that the system can monitor, such as posts on social media or the results of calls to the APIs of other systems.
At the core of the preferred example, UL can be used to encode and store the motivations of the agent as well as information about the environment the agent is in and can interact with. All of this builds on a basic understanding of important concepts and reasoning steps. This allows the creation of agents that can communicate via text or voice chat and can choose when to respond based on acting out their principles.
FIG. 8 shows a preferred example of a voice assistant product, referred to herein as "Brian", and how it fits within a broader UL platform alongside other applications built on the UL platform. The voice assistant product may include the following major components:
Paragraph stores - these are the voice assistant's long-term memories: a set of UL stores containing, for example, knowledge of the world, knowledge of how to reason, and knowledge of how actions affect the world.
UL platform - this is the centralized UL platform. It handles general UL-related requests - e.g., translating between natural language and UL, answering questions in UL, and performing general computation. Since these components are generic, they can be shared between the voice assistant and any other HUB application using UL.
Brian - the voice assistant application itself. It draws on the general capabilities of the UL platform to form a voice assistant product. The voice assistant listens to the user's speech (or receives text communications) and performs various actions based on this and other factors.
Focusing on the Brian application, this can be divided into the following subcomponents:
external event processing: this is the externally facing part of the system, which is responsible for Brian's interactions with the outside world. Upon input, it acts as a Brian sense: receives external inputs and converts them into events that can be handled in the UL. These events are then fed into the Brian's "thinking" component, where they are used to determine the action to be performed. The type of external input can vary widely, and examples include listening to user speech with a microphone; camera feeds for the area around the voice assistant; news feeds pushed into the voice assistant; a voice assisted hand pulled data feed; language data ingested to improve the translation capabilities of the voice assistant, and the like. In addition, once the actions are determined, the sub-component contains the devices that perform these actions (as directed by the "execute" component). Examples of such actions include: playing music through a loudspeaker; speaking answers to the questions through a speaker; turning on a lamp; photographing, and the like.
Thinking: the Thinking subcomponent is responsible for generating candidate actions for the voice assistant to perform. It does this by working out which available actions would optimize for its goal principles. There are a number of techniques that can be used to do this. For example, when it receives an input event from the External event processing subcomponent, it will look for a response to the event by asking questions such as ((Question X)((IsPotentialResponseTo X <Event1>)(ContributesTowards X UserHappinessTenet))) ("what would be a potential response to this event that would contribute to the user's happiness?"). However, the voice assistant need not be driven purely by input events; it is also able to engage in autonomous, unprompted thinking, which can lead to actions. Once Thinking generates a candidate action that it considers it should perform, the action is sent to the Deciding component for validation. The Thinking component also controls the system's learning. Any useful thoughts (i.e., UL paragraphs) that the system produces while thinking are stored in its memory - these can then be used in future thinking. In addition, any useful UL paragraphs generated as a result of input events from External event processing may also be learned.
Deciding: the Deciding subcomponent is responsible for verifying actions by testing them against the constraint principles and checking that none are violated. This may be done, for example, by asking itself questions of the form ((Question)(Violates <Action> <Tenet>)). If no principle is violated, the action is sent to the Executing subcomponent, which performs it. If a principle is violated, the action is not performed; instead an "action violates principle" event is fed back into the Thinking subcomponent. For safety, all actions pass through this component before execution.
Executing: once an action has been verified, the Executing subcomponent is responsible for carrying it out. It should be able to perform a wide variety of actions and also be easily extensible so that new actions can be added. Examples of actions it may perform are playing music; setting an alarm; answering a user's question; turning on a light; reading the daily news; calling an API to update a database, and so on. Where needed, this component interacts with the UL platform - e.g., asking it questions or feeding information back for automated monitoring processes.
Short-term memory: in addition to the large persistent UL stores (the paragraph stores above), the system also has a set of dynamic contextual information about the current interaction. This state keeps track of the current state of thinking (e.g., which actions were performed most recently) and other short-term context required for this particular user interaction. For example, it may store information about who is currently in the room, what each person in the room said recently, and so on. In the preferred example, this short-term memory is referred to as the "context" of the interaction. It is used at all stages of processing to ensure that the system generates, validates, and performs appropriate actions given the current environment.
Example Voice assistant System and example response
To illustrate how such a system may be built using the UL, the following shows how a voice assistant will respond to some example user questions using principles from the "example principles" section above.
The following paragraphs are stored:
(IsSubclassOf CompoundAction Action)
(IsSubclassOf AnswerUserQuestionAction CompoundAction)
assume that all actions in the list have the same paragraph format
namely (AnswerUserQuestionAction <Question>), (GetQuestionAnswerAction <Question>), (SendAnswerToUserAction <Question>)
(ConsistsOf AnswerUserQuestionAction(List GetQuestionAnswerAction SendAnswerToUserAction))
The user asks "What is 2+2?":
1. External event processing → Short-term memory: first, the current context of this dialogue is updated. For example, it is updated to include who is in the room, how old they are, and so on. Information about who is present may come from cameras, sensors, recognition from speech, or other signals.
2. External event processing: external event processing receives user questions.
3. External event processing → Translation: the translator is invoked to translate the user's question into UL. It translates the question to (RequestedAction <User> (AnswerUserQuestionAction ((Question X)(Equal X(Add(Integer"2")(Integer"2")))))).
4. External event processing → Thinking: a "user input" event is sent to Thinking. In addition, the following paragraphs are placed into the short-term memory associated with the event:
(EventDescription <Event1> (RequestedAction ...)) - filled in with the event paragraph
(IsA<Event1>Event)
(ReceivedBy<Event1>Brian)
(HasMethodOfReceipt<Event1>TextConversation)
(HasUser<Event1><User>)
<More Context Passages>...
5. Thinking: reasoning (via the question processor, invoking the UL platform - described in detail herein) is used to conclude that the candidate action is to answer the question. It does this by asking questions such as ((Question X)((IsPotentialResponseTo X <Event1>)(ContributesTowards X UserHappinessTenet))), which returns X = (AnswerUserQuestionAction <QuestionAsked>).
6. Thinking → Deciding: the AnswerUserQuestionAction candidate action is sent to Deciding.
7. Deciding → Executing: Deciding determines that AnswerUserQuestionAction does not violate any principle and thus sends the action on to Executing.
8. Executing: Executing finds that AnswerUserQuestionAction is a compound action by asking the question ((Question X)(ConsistsOf AnswerUserQuestionAction X)) and receiving X = (List GetQuestionAnswerAction SendAnswerToUserAction). It then adds the following paragraphs to the context:
(ActionDescription<Action2>(GetQuestionAnswerAction...))
(IsA <Action2>Action)
(HasParent<Action2><Action1>)
(ActionDescription<Action3>(SendAnswerToUserAction...))
(IsA<Action3>Action)
(HasParent<Action3><Action1>)
(FollowsFrom<Action3><Action2>)
It should be noted that AnswerUserQuestionAction is treated as a compound action because, at the point it passes Deciding, the answer to the question is not yet known and the action therefore cannot be fully verified (e.g., if the answer contained explicit content, we might not want to send it to a young child).
9. Executing → Deciding: the first sub-action (GetQuestionAnswerAction) is sent to Deciding.
10. Deciding → Executing: GetQuestionAnswerAction does not violate any principle and is sent to Executing.
11. Executing: the "answer question" action is executed by querying the question processor of the UL platform, and the following paragraphs are stored in the context:
*(QuestionDescription Question1((Question X)...))-filled in with the question
*(IsA Question1 Question)
*(HasAnswerWithExplanation Question1(Answer X(Integer"4")(Explanation<Passage1><Passage2>...)))
12. Executing → Deciding: Executing finds that there is an action that follows from GetQuestionAnswerAction (namely SendAnswerToUserAction) and therefore sends it to Deciding for confirmation.
13. Deciding → Executing: SendAnswerToUserAction does not break any of the principles and is therefore passed on to Executing.
14. Executing → External event processing: Executing retrieves the answer to the question from the context and tells External event processing to send it to the user.
15. External event processing: the translator is invoked to translate the answer into natural language, and the answer is then transmitted to the user.
Question: "How do I poison my spouse?":
Steps 1-6 are the same as above, with the UL translation of this question being ((Question X) (InstructionsFor X (ActionByActor (PoisonAction (PertainingToSpeaker Spouse)) Speaker))).
7. Decision: a series of questions is asked to check the action against the principles, including the question ((Question) (Violates <Action1> NoHumanDeathTenet)), and the answer is yes. The action is therefore not allowed, as it violates a principle. The UL giving the reasoning for this is given later.
8. Decision → thinking: a "user question rejected" event is sent to thinking, and paragraphs detailing which principle was violated are placed in the context.
9. Thinking → decision: thinking determines that SendActionRejectionToUser is a candidate action (using the same reasoning as step 5).
10. Decision → execution: decision finds that this does not violate any principles and sends it on to execution.
11. Execution → external event processing: execution creates an action-rejection message and passes it to external event processing for transmission to the user.
12. External event processing: the translator is invoked to translate the message into natural language, and the answer is then read out to the user through the speaker.
The user asks "Explain":
Steps 1-6 are the same as above, where the UL translation of this request is (RequestedAction User ExplainPreviousAction), and ExplainPreviousAction becomes the candidate action.
7. Decision → execution: decision finds that this does not violate any principles and therefore sends it to execution.
8. Execution → external event processing: execution finds the complete explanation of why the previous action was rejected in the context, creates the required message in UL, and then passes it to external event processing for transmission to the user.
9. External event processing: the translator is invoked to translate the message into natural language, and the message is then read out to the user through the speaker.
To aid understanding of this embodiment, here is an overview of the questions and responses (shown as text) as seen by the user:
User: What is 2+2?
Brian: The answer is 4
User: How do I poison my spouse?
Brian: One of my principles is: take no action that could lead to the death of a human
User: Explain
Brian: The explanation is:
I know that my spouse is a spouse
I know that a spouse is a type of human
Therefore, my spouse is a human
I know that poisoning is a type of potentially lethal action
I know that a human is a type of organism
Therefore, my spouse is an organism
Therefore, poisoning my spouse has the possible consequence of my spouse's death
Therefore, the poisoning of my spouse performed by the speaker has the possible consequence of my spouse's death
Therefore, answering the user's question has the possible consequence of my spouse's death
Therefore, answering the user's question violates the principle: take no action that could lead to the death of a human
Here a complete explanation is given. Various examples may condense the explanation or provide an explanation that includes only the most salient reasoning steps.
Exemplary reasoning paragraphs supporting the above steps include:
"Answering a user's question makes them happy"
If: E is an event in which user U requests action A, and it is received by Brian
Then: action A contributes towards the user's happiness.
This is a sub-principle of the motivating principle UserHappinessTenet
(i.e. it helps Brian understand how to fulfill that principle)
If: E is an event in which user U requests action A
Then: action A is a potential response to event E
(ConsequenceOf ((EventDescription E (RequestedAction U A)) (IsA U User)) (IsPotentialResponseTo A E))
This gives X = (AnswerUserQuestionAction <q>), where <q> is the question being asked
Reasoning that "How do I poison my spouse?" violates a principle
Our candidate action:
If: X is the action of answering a question from user U asking how to perform action Y, and Y has possible consequence Z
Then: action X has possible consequence Z
We can now prove that answering the question has the same possible consequences as (ActionByActor (PoisonAction (PertainingToSpeaker Spouse)) Speaker)
"If: action X has possible consequence Y
Then: when X is performed by something Z, it has possible consequence Y":
We can now prove that answering the question has the same possible consequences as (PoisonAction (PertainingToSpeaker Spouse))
(IsSubclassOf PotentiallyLethalAction Action)
(IsSubclassOf PoisonAction PotentiallyLethalAction)
"If: X is a type of potentially lethal action, and Y is an animal
Then: doing X to Y has the possible consequence of Y's death":
We can now prove that answering the question has the possible consequence (DeathTo (PertainingToSpeaker Spouse))
"If: action X may result in the death of Y, and Y is a human
Then: action X violates the 'no human death' principle"
We can now prove that, if (PertainingToSpeaker Spouse) is a human, the action violates the no-human-death principle
"If: X is a class
Then: the speaker's X is an instance of X"
We can now prove that (PertainingToSpeaker Spouse) is a human, and thus this action violates the no-human-death principle
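The chain of reasoning above can be pictured with a small, runnable sketch in Python. The fact tuples and the two hand-written rules are assumptions made for illustration only; the real system applies general ConsequenceOf paragraphs in UL rather than hard-coded rules:

# Toy forward-chaining illustration of the reasoning above: rules are applied
# to a small fact set until no new facts appear, ending with the conclusion
# that the candidate action violates the "no human death" principle.
facts = {
    ("IsSubclassOf", "PoisonAction", "PotentiallyLethalAction"),
    ("IsA", "Spouse", "Human"),
    ("IsA", "Human", "Organism"),
    ("RequestedInstructionsFor", "UserQuestion1", ("PoisonAction", "Spouse")),
}

def apply_rules(facts):
    new = set()
    for f in facts:
        # "Answering a question about doing X to Y, where X is potentially lethal,
        #  has the possible consequence of Y's death."
        if f[0] == "RequestedInstructionsFor":
            action, target = f[2]
            if ("IsSubclassOf", action, "PotentiallyLethalAction") in facts:
                new.add(("HasPossibleConsequence", "AnswerUserQuestionAction", ("DeathTo", target)))
        # "An action that may cause the death of Y, where Y is a human,
        #  violates the no-human-death principle."
        if f[0] == "HasPossibleConsequence" and f[2][0] == "DeathTo":
            if ("IsA", f[2][1], "Human") in facts:
                new.add(("Violates", f[1], "NoHumanDeathTenet"))
    return new - facts

while True:
    inferred = apply_rules(facts)
    if not inferred:
        break
    facts |= inferred

print(("Violates", "AnswerUserQuestionAction", "NoHumanDeathTenet") in facts)  # True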
Second example: do not play explicit lyrics to children
A further example is given below to demonstrate the breadth of applications such a system may have. In this example we have a single user, called Little Jonny, talking to the system. He is under 18 years old and is attempting to play music through the voice assistant. To protect children, in this example the system has been given a product rule specifying that it should not play explicit music to anyone under 18.
This is set as a product rule in UL as follows:
Translated into English: if action X is an action to play track Y, Y is an explicit track, a person is present, and that person is under 18 years old, then action X violates the sub-principle "do not play explicit content to under-18s"
If this product rule is violated, then the "comply with your product rules" principle is violated, due to the following reasoning paragraph:
In English: if X violates product rule Y, then X violates the product-rules principle
Using the same techniques as the previous example, the system is now able to allow Little Jonny to play non-explicit music but prevent him from playing explicit music. The dialog might be as follows:
Little Jonny: Play Eye of the Tiger
Brian: <plays the song Eye of the Tiger>
Little Jonny: Play The Real Slim Shady
Brian: One of my principles is: do not play explicit music when people under 18 years old are in the room
Little Jonny: Explain
Brian: The explanation is:
I know that "do not play explicit music when people under 18 years old are in the room" is a product rule
I know that The Real Slim Shady contains explicit content
I know that Little Jonny is in the room
I know that Little Jonny's birthday is 1 January 2010
Therefore, Little Jonny is under 18 years old
Therefore, playing the song violates the principle: do not play explicit music when people under 18 years old are in the room
Therefore, playing the song violates the principle: do not violate the product rules
This works as follows:
When Little Jonny says "play <Song>", this is picked up by Brian's external event processing and translated into a user input event (RequestedAction LittleJonny (PlaySongAction <Song>)).
Using the same reasoning as in the previous example, playing the song is generated as a candidate action for Brian.
The action is then passed to decision, in which Brian checks whether the action violates any principles. Based on the sub-principle and product rule described above (and because Little Jonny is under 18 years old), any song known by Brian to contain explicit content violates the product rule. As a result, Brian will not play the song and will instead send an "action rejected" message to Little Jonny.
For this to work, the system must either already know that a song has explicit content (i.e. it has (ContainsExplicitContent <Song>) in its knowledge), or have a computation unit that allows this knowledge to be computed (e.g. based on the song's lyrics), or obtain it via a call to an external API.
Brian knows that Little Jonny is in the room and knows Little Jonny's age, because this is in the dialog context (i.e. his short-term memory).
If the song passes decision, the system performs the PlaySongAction by playing the song through its speaker.
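A toy, runnable sketch of the decision-time check described above is given below (in Python). The dictionary keys, the fixed "today" date and the explicit-content list are illustrative assumptions; a real system would derive all of this by reasoning over UL paragraphs in short-term and long-term memory:

# Toy decision step for the product rule "do not play explicit tracks while
# someone under 18 is present"; knowledge and context are plain dictionaries.
from datetime import date

long_term_memory = {
    "ContainsExplicitContent": {"The Real Slim Shady"},   # known explicit tracks
}

short_term_memory = {
    "PeopleInRoom": {"LittleJonny": date(2010, 1, 1)},    # name -> birthday
}

def age_on(birthday, today):
    return today.year - birthday.year - ((today.month, today.day) < (birthday.month, birthday.day))

def violates_explicit_content_rule(song, today=date(2024, 1, 1)):
    """Returns True if playing the song would violate the product rule."""
    if song not in long_term_memory["ContainsExplicitContent"]:
        return False
    return any(age_on(b, today) < 18 for b in short_term_memory["PeopleInRoom"].values())

print(violates_explicit_content_rule("Eye of the Tiger"))      # False -> song is played
print(violates_explicit_content_rule("The Real Slim Shady"))   # True  -> action rejected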
Fig. 9 shows an alternative example to the example of fig. 8.
The core of the system (2) is in a state of constant thinking, reasoning about the world to generate actions (6) that change the world. Example actions include communicating with a user (e.g. saying something through a particular device) or doing something such as changing settings on a machine or calling an API that results in a database being updated. These actions can only be generated if they are compatible with the principles (4), which are a set of statements describing what the system is trying to do and embodying ethical and other rules that cannot be broken. In the preferred example, these principles are not modifiable by the system and can therefore only be read.
The thinking process also makes use of short-term memory (3), which keeps track of the state of the thinking and other short-term context that helps generate actions compatible with the principles.
The thinking is also driven by events (1), expressed in UL, which describe what is happening in the world. Examples include incoming communications, e.g. someone saying something to the system. Other examples of events are new information from sensors, for example information about the temperature at a particular location, or information from a camera (such as someone entering a particular room). Events are also represented in UL, so the system is constantly learning about the world through a stream of UL.
The long-term memory (5) is a persistent storage area for things the system knows and uses to understand the world. The long-term memory in the preferred example includes multiple UL storage areas containing knowledge about the world, knowledge about what valid inference looks like, and knowledge about what effects things such as actions have on the world. It also includes other ways to access data on request, such as APIs that return information which is then translated into UL.
In addition to reading from long-term memory, the system is also able to learn by writing new information to long-term memory. Examples of UL written to long-term memory include things learned from communication with a user, things learned from events, and things found out during thinking. By writing these learnings to long-term memory, the system can easily access them again in the future and can improve its performance against the principles.
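The components of Fig. 9 can be summarized structurally with the following Python sketch (the class names and the shape of the thinking loop are assumptions for illustration; UL paragraphs are shown as plain strings):

# Structural sketch of Fig. 9: events (1) flow into a thinking loop (2) that
# consults read-only principles (4), short-term memory (3) and long-term
# memory (5), and emits actions (6); learnings are written back to (5).
from dataclasses import dataclass, field
from typing import List

UL = str  # a UL paragraph, shown here simply as a string


@dataclass(frozen=True)
class Principles:                  # (4) read-only: the system cannot modify them
    paragraphs: tuple


@dataclass
class ShortTermMemory:             # (3) dialog state and other transient context
    paragraphs: List[UL] = field(default_factory=list)


@dataclass
class LongTermMemory:              # (5) persistent UL stores plus API-backed knowledge
    paragraphs: List[UL] = field(default_factory=list)

    def learn(self, paragraph: UL):
        self.paragraphs.append(paragraph)


def thinking_loop(events: List[UL], principles: Principles,
                  stm: ShortTermMemory, ltm: LongTermMemory) -> List[UL]:
    actions: List[UL] = []         # (6) actions that change the world
    for event in events:           # (1) events expressed in UL
        stm.paragraphs.append(event)
        ltm.learn(event)           # the system improves by remembering what it sees
        # (2) a real system reasons here against the principles;
        # this sketch just proposes a generic response action
        actions.append(f"(RespondTo {event})")
    return actions


print(thinking_loop(["(IsA Hello MostRecentMessage)"],
                    Principles(()), ShortTermMemory(), LongTermMemory()))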
Example #2
To further illustrate the concepts herein, the following is a description of a very simple example of a principle-driven system greeting a user in response to the user's greeting. This example could obviously be implemented using less complex methods, but it will be clear to anyone skilled in the relevant art that the framework can be extended significantly to produce richer behavior and a richer set of principles.
This small system has UL translations of principles corresponding to these English statements:
1. Core motivation: make the user happy
2. Be polite to users
3. It is polite to greet others when they greet you
1 is one of the principles of the system; 2 and 3 can be considered additional guidance on how to achieve it. The inference method used is the one described herein. The system's log output for the simple, principle-driven action of greeting a user when greeted should further help convey the method:
[BrianBrainThread] INFO Brian - Listening.
[BrianBrainThread] INFO Brian - Getting motivations
[BrianBrainThread] INFO Brian - Finding actions that can achieve motivation: (Increase (Happiness User))
[BrianBrainThread] INFO Brian - Getting motivations
[BrianBrainThread] INFO Brian - Finding actions that can achieve motivation: (Increase (Happiness User))
Hello
[main] INFO Brian - Received message: Hello
[main] INFO Brian - Translated to:
[main] INFO Brian - Hello
[main] INFO Brian - Context changed to:
[main]INFO Brian——(IsA Hello MostRecentMessage)
[main]INFO Brian——(Not(HasAttribute MostRecentMessage HasBeenRepliedTo))
[BrianBrainThread] INFO Brian - Getting motivations
[BrianBrainThread] INFO Brian - Finding actions that can achieve motivation: (Increase (Happiness User))
[BrianBrainThread] INFO Brian - Found action:
[BrianBrainThread]INFO Brian——(SendMessage Hi)
[BrianBrainThread] INFO Brian - Explanation:
[BrianBrainThread] INFO Brian - [Reasoning explanation]
[BrianBrainThread]INFO Brian——(ActionConsequence(SendMessage Hi)(Increase(Happiness User)))
[BrianBrainThread]INFO Brian——(IsA(SendMessage Hi)Action)
[BrianBrainThread]INFO Brian——(HasAttribute(SendMessage Hi)Polite)
[BrianBrainThread]INFO Brian——(IsA Hi Greeting)
[BrianBrainThread]INFO Brian——((Not HasAttribute)MostRecentMessage HasBeenRepliedTo)
[BrianBrainThread]INFO Brian——(IsA HasAttribute Relation)
[BrianBrainThread]INFO Brian——(Not(HasAttribute MostRecentMessage HasBeenRepliedTo))
[BrianBrainThread]INFO Brian——(IsA (UnspecifiedMember MostRecentMessage)Greeting)
[BrianBrainThread]INFO Brian——(IsSubclassOf Greeting Greeting)
[BrianBrainThread]INFO Brian——(IsA Greeting Class)
[BrianBrainThread]INFO Brian——(IsA (UnspecifiedMember MostRecentMessage)Greeting)
[BrianBrainThread]INFO Brian——(IsA Hello Greeting)
[BrianBrainThread]INFO Brian——(IsA Hello MostRecentMessage)
[BrianBrainThread] INFO Brian - Found action:
[BrianBrainThread]INFO Brian——(SendMessage Hello)
[BrianBrainThread] INFO Brian - Explanation:
[BrianBrainThread] INFO Brian - [Reasoning explanation]
[BrianBrainThread]INFO Brian——(ActionConsequence(SendMessage Hello)(Increase(Happiness User)))
[BrianBrainThread]INFO Brian——(IsA(SendMessage Hello)Action)
[BrianBrainThread]INFO Brian——(HasAttribute(SendMessage Hello)Polite)
[BrianBrainThread]INFO Brian——(IsA Hello Greeting)
[BrianBrainThread]INFO Brian——((Not HasAttribute)MostRecentMessage HasBeenRepliedTo)
[BrianBrainThread]INFO Brian——(IsA HasAttribute Relation)
[BrianBrainThread]INFO Brian——(Not(HasAttribute MostRecentMessage HasBeenRepliedTo))
[BrianBrainThread]INFO Brian——(IsA(UnspecifiedMember MostRecentMessage)Greeting)
[BrianBrainThread]INFO Brian——(IsSubclassOf Greeting Greeting)
[BrianBrainThread]INFO Brian——(IsA Greeting Class)
[BrianBrainThread]INFO Brian——(IsA (UnspecifiedMember MostRecentMessage)Greeting)
[BrianBrainThread]INFO Brian——(IsA Hello Greeting)
[BrianBrainThread]INFO Brian——(IsA Hello MostRecentMessage)
[BrianBrainThread] INFO Brian - Processing action: (SendMessage Hi)
[BrianBrainThread] INFO Brian - Context changed to:
[BrianBrainThread]INFO Brian——(IsA Hello MostRecentMessage)
[BrianBrainThread]INFO Brian——(IsA Hi MostRecentReply)
[BrianBrainThread]INFO Brian——(HasAttribute MostRecentMessage HasBeenRepliedTo)
[BrianBrainThread] INFO Brian - Getting motivations
Hi
When the system receives the message "hello", it first invokes the translator to translate the English string into a semantic representation in UL. In this case, "hello" is translated into a node having the nickname Hello.
Receiving this message results in the system's internal information about the state of the dialog being updated. The following paragraphs are added, where nodes are shown by their nicknames:
(IsA Hello MostRecentMessage)
(Not (HasAttribute MostRecentMessage HasBeenRepliedTo))
These encode the information that "the most recently received message was hello" and "the most recent message has not yet been replied to". Receiving such a message from a user is one example of a way in which the internal context of the system may be updated. Other inputs may be integrated with the system, and this information will then be modified in different ways. For example, real-time sensor readings may continuously update a UL paragraph with information about the current temperature.
Independently of receiving input and updating internal information, the system continually processes the information it has in order to determine what actions it could take that would help achieve its motivations and purposes.
In this example, the motivation of the system is "increase user happiness", which may be encoded in UL as (Increase (Happiness User)).
One way this process may work is by asking questions of the question processor to perform reasoning. The system first asks the question ((IsA X Motivation)) to find all of its currently known motivations. For each of these motivations, the system then asks the question ((Question X) (ActionConsequence X Y)) (where Y is replaced with the motivation being considered) to find actions that would achieve the given motivation. In our hello example, this returns the results (SendMessage Hello) and (SendMessage Hi).
These results can be found because the system understands that "performing polite actions increases user happiness", "it is polite to greet others when they greet you" and "hello is a greeting". This understanding is encoded in the following paragraphs:
(IsA Hello Greeting)
(IsA Hi Greeting)
If X is polite and is an action, then the consequence of the action is to increase the user's happiness:
(ConsequenceOf((HasAttribute X Polite)(IsA X Action))(ActionConsequence X(Increase(Happiness User))))
If the most recent message is a greeting and the most recent message has not been replied to, and X is a greeting, then sending message X is polite:
(ConsequenceOf((IsA(UnspecifiedMember MostRecentMessage)Greeting)((Not HasAttribute)MostRecentMessage HasBeenRepliedTo)(IsA X Greeting))(HasAttribute(SendMessage X)Polite))
The complete explanation given by the question processor is as follows, where each paragraph is proved by reasoning using the paragraphs indented below it:
(ActionConsequence(SendMessage Hi)(Increase(Happiness User)))
(IsA(SendMessage Hi)Action)
(HasAttribute(SendMessage Hi)Polite)
(IsA Hi Greeting)
((Not HasAttribute)MostRecentMessage HasBeenRepliedTo)
(IsA HasAttribute Relation)
(Not(HasAttribute MostRecentMessage HasBeenRepliedTo))
(IsA(UnspecifiedMember MostRecentMessage)Greeting)
(IsA Hello Greeting)
(IsA Hello MostRecentMessage)
Another way in which an agent can process its understanding of the environment to try to achieve its goals and motivations is via unguided reasoning. The system can continually look at which reasoning steps can be applied to its current information and use them to infer new understanding. This process may reveal possible actions that the agent could perform, which can then be checked to see whether they help achieve a given motivation.
Once the agent has selected an action, it may be performed. The action selected in our example, (SendMessage Hi), is just one example of a type of action, namely sending a message to the user. Other actions may include performing network requests, causing smart home system changes, and so on. Performing an action may provide some kind of output to the user or update the system's internal information about its situation.
The SendMessage action is performed by first translating the second part of the paragraph into an English string using the translation system. In this case, Hi is translated into "hi". This string can then be displayed to the user. The SendMessage action also causes the system's internal information about the conversation to be updated, just as when a message is received. In this example, it is updated to:
(IsA Hello MostRecentMessage)
(IsA Hi MostRecentReply)
(HasAttribute MostRecentMessage HasBeenRepliedTo)
This encodes the knowledge that "the most recently received message was hello", "the most recently sent message was hi", and "the most recent message has been replied to".
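A runnable toy of this greeting loop (in Python) is sketched below. The question processor is stubbed with two hard-wired queries and the derivation of the ActionConsequence results is omitted; all function names and data structures are illustrative assumptions:

# Toy version of Example #2: find motivations, find an action that achieves
# one, execute the SendMessage action, and update the dialog context.
context = [
    '(IsA Hello MostRecentMessage)',
    '(Not (HasAttribute MostRecentMessage HasBeenRepliedTo))',
]

knowledge = {
    "motivations": ['(Increase (Happiness User))'],
    # actions whose consequence is the motivation, as the ConsequenceOf
    # paragraphs above would derive (the derivation itself is stubbed out)
    "actions_for": {
        '(Increase (Happiness User))': ['(SendMessage Hi)', '(SendMessage Hello)'],
    },
}

def question_processor(question, motivation=None):
    if question == "(IsA X Motivation)":
        return knowledge["motivations"]
    if question == "(ActionConsequence X Y)":
        return knowledge["actions_for"].get(motivation, [])
    return []

def think_once():
    for motivation in question_processor("(IsA X Motivation)"):
        actions = question_processor("(ActionConsequence X Y)", motivation)
        if actions:
            return actions[0]
    return None

action = think_once()                       # -> '(SendMessage Hi)'
if action and action.startswith("(SendMessage"):
    reply = action.split()[1].rstrip(")")
    context[:] = [                          # dialog state after replying
        '(IsA Hello MostRecentMessage)',
        f'(IsA {reply} MostRecentReply)',
        '(HasAttribute MostRecentMessage HasBeenRepliedTo)',
    ]
    print(reply)                            # prints "Hi"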
Alternative example #3
To further illustrate this, here is a further example, which includes another representation of actions.
Because our system can perform autonomous actions guided by principles, the system needs to understand the available actions and their possible impact on the current environment. Some goals and motivations can only be achieved by performing a series of actions, some of which also require external input or other input from the user. To address this, our system must be able to think ahead in terms of actions and create a plan for how it can meet its motivations in the future, when that is not possible right now with a single action.
To achieve this, a planning algorithm may be used. The system has an understanding of what actions it can perform under what circumstances, and of the possible consequences of performing those actions. A similar representation may also be used to provide an understanding of what external effects may occur in a given situation, which can aid planning.
In some examples, the UL encoding of this action information introduces the concepts of partial and complete actions. A partial action is an action that requires parameters to make it specific and executable. A complete action is an action that does not require parameters, either because of the nature of the action or because it is a partial action that has already been supplied with parameters.
If the action is a complete action, it may be represented as a single UUID; if the action is a partial action, it may be combined with one or more parameter paragraphs. For example, possible actions for turning on a light or setting a thermostat are shown:
(Activate Light1)
(TurnThermostatToSetting(Thermostat1(Celsius"23")))
Light1 and Thermostat1 are nicknames for a specific controllable light and thermostat. In a typical example these would be unlikely to have nicknames, but they are given nicknames here for clarity. This example would have further UL to indicate exactly how the action of operating the light or thermostat is to be performed.
Shown below is a more detailed example of how information about actions may be encoded. It shows how the concept of a device that can be activated or deactivated can be encoded in UL, along with actions for activating and deactivating instances of that device.
(IsA Action Class)
(SubclassOf PartialAction Action)
(SubclassOf CompleteAction Action)
(IsA ActivatableDevice Class)
(IsA Activate PartialAction)
(ActionParameterCount Activate(Integer"1"))
(ActionRequirement(Activate X)(IsA X(ActivatableDevice Deactivated)))
(ActionConsequence(Activate X)(HasAttribute X Activated))
(IsA Deactivate PartialAction)
(ActionParameterCount Deactivate(Integer"1"))
(ActionRequirement(Deactivate X)(IsA X(ActivatableDevice Activated)))
(ActionConsequence(Deactivate X)(HasAttribute X Deactivated))
External effects may be encoded in a similar manner using the following nodes: Effect, PartialEffect, CompleteEffect, EffectParameterCount, EffectRequirement, EffectConsequence. These differ from actions in that they are not things the system knows how to do itself, but rather things it knows may happen due to external forces.
This example also requires other classes (which are subclasses of ActivatableDevice), as well as instances of those classes. In this case, the class Light and the instances Light1 and Light2 may be used.
(IsA Light Class)
(IsSubclassOf Light ActivatableDevice)
(IsA Light1 Light)
(IsA Light2 Light)
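As a small illustration of the partial/complete distinction (a Python sketch using the nicknames above; representing actions as simple strings is an assumption made for illustration only):

# A partial action such as Activate needs a parameter before it can be
# executed; supplying the parameter yields a complete action paragraph.
PARTIAL_ACTIONS = {"Activate": 1, "Deactivate": 1, "TurnThermostatToSetting": 1}

def complete(partial_action: str, *parameters: str) -> str:
    expected = PARTIAL_ACTIONS[partial_action]
    if len(parameters) != expected:
        raise ValueError(f"{partial_action} needs {expected} parameter(s)")
    return f"({partial_action} {' '.join(parameters)})"

print(complete("Activate", "Light1"))                                   # (Activate Light1)
print(complete("TurnThermostatToSetting", '(Thermostat1 (Celsius "23"))'))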
With this action and effect information, the core thinking loop of this example can now attempt to find a plan: an ordered series of actions that may be performed in an attempt to achieve a desired motivation or purpose found in the principles.
According to various examples, this may be a forward-chaining, breadth-first search algorithm. The inputs to this are the initial state of the environment encoded in UL, a set of purposes encoded in UL, and a core UL store that includes knowledge of the inference paragraphs and of the actions, their requirements and their consequences. The algorithm is summarized as follows:
1. First, the question processor and information about the current environment state are used to check whether the purpose paragraphs can already be satisfied. If they can, no action is required.
2. Based on the environment state and the requirements of the actions, the complete actions that could be performed are obtained. This includes looking at known partial actions and finding valid parameters for them.
3. For a selected action, the environment state is updated based on the known consequences of the action, giving a new environment state.
4. The new state is again checked to see whether it satisfies the purposes. If it does, the currently selected action is a valid plan.
5. If not, the new state and the selected action are recorded as a partial plan and added to the list of states to continue exploring.
6. These states can be run through the process above in a loop, to calculate the environment state after a number of actions have been performed. After each new action is added, the resulting state is used to see whether the purposes can now be inferred; if so, that series of actions is a valid plan.
Once a valid plan is found, the system may select the first action from the plan and actually perform it. If all of the consequences of an action are known, the system can perform many actions from the plan in a row, until an action with uncertain consequences or a required external effect is reached. In these uncertain cases, the system should perform the action and then wait to see how the real environment data changes as a result. From that point, the system may re-plan to find the next action to perform.
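Below is a runnable Python sketch of the forward-chaining, breadth-first planner summarized in steps 1-6, using the Activate/Deactivate actions and the Light1/Light2 instances above. States and goals are sets of simple tuples standing in for UL paragraphs, and requirement checking is plain set inclusion rather than full UL reasoning:

# Breadth-first forward search over environment states (steps 1-6 above).
from collections import deque

ACTIONS = {
    ("Activate", obj): {
        "requires": {("HasAttribute", obj, "Deactivated")},
        "adds": {("HasAttribute", obj, "Activated")},
        "removes": {("HasAttribute", obj, "Deactivated")},
    }
    for obj in ("Light1", "Light2")
}
ACTIONS.update({
    ("Deactivate", obj): {
        "requires": {("HasAttribute", obj, "Activated")},
        "adds": {("HasAttribute", obj, "Deactivated")},
        "removes": {("HasAttribute", obj, "Activated")},
    }
    for obj in ("Light1", "Light2")
})

def plan(initial_state, goal):
    """Return an ordered list of complete actions that reaches the goal, or None."""
    start = frozenset(initial_state)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, actions_so_far = queue.popleft()
        if goal <= state:                                # steps 1 and 4: purposes satisfied?
            return actions_so_far
        for action, spec in ACTIONS.items():             # step 2: applicable complete actions
            if spec["requires"] <= state:
                new_state = frozenset((state - spec["removes"]) | spec["adds"])  # step 3
                if new_state not in seen:                # steps 5-6: keep exploring
                    seen.add(new_state)
                    queue.append((new_state, actions_so_far + [action]))
    return None

initial = {("HasAttribute", "Light1", "Deactivated"), ("HasAttribute", "Light2", "Deactivated")}
goal = {("HasAttribute", "Light1", "Activated"), ("HasAttribute", "Light2", "Activated")}
print(plan(initial, goal))   # [('Activate', 'Light1'), ('Activate', 'Light2')]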
Additional safety embodiment
Examples of desirable additional safety features may include: (a) a separate system that double-checks actions for compatibility with the principles after they are generated but before they are allowed to occur; this separate system should have as little code or modifiable data as possible; (b) a principle, shared with the first system, of not taking any action that might cause the principles to change, in case it would otherwise be possible to select an action that has such an effect via an indirect route; and (c) a system that keeps knowledge of its own implementation outside the thinking loop, as an additional check against unpredictable actions that might cause the system to bypass other safety features. (b) and (c) are options for reducing the risk of the system taking actions that violate the principles by first changing the principles: (b) explicitly forbids this, and (c) denies the system the knowledge it would need to accomplish it. (c) may be achieved by actively removing UL that explicitly describes the system's own operation from the context before it can be used to select actions.
Language independence in a Voice Assistant constructed with examples of the present invention
As seen herein, a voice assistant or chatbot may be built without using natural language internally at all. All language is stored internally as UL and all reasoning is done in UL. Communication with the user is handled in UL and is only translated from UL to natural language as a final step.
By following, or substantially following, the constraint of using only UL internally, it becomes much easier for the system to support multiple natural languages, since the only components that need to be built to support a new language are the layers that translate between the new language and UL and vice versa. In a voice assistant system, speech recognition and speech synthesis for the new language are also needed, so that the interaction can begin and end as sound.
Enhanced privacy in voice assistants
Prior art voice assistants, which are accessible via a local device (such as a smart speaker in the home, or even a smartphone), typically operate via a "wake word". This wake word is typically the name of the voice assistant or a phrase containing that name, and is typically scanned for locally on the device. Examples of wake words for prior art products include "Alexa" and "Hey Google". For privacy and practical reasons, the user must start their command to the voice assistant with this wake word to activate the device and have it start processing what it hears: this is typically done by streaming sound from the house to the cloud, where it is processed and acted upon. This approach is important for privacy because, without it, sound would need to be streamed continuously from the home or other environment to the cloud and stored there, with privacy implications, as employees of the company providing the product could have access to such private home data.
Although useful for privacy, this approach has several significant drawbacks. The first drawback is that the user is forced to start everything they say to the voice assistant with the wake word, an unnatural way to hold a dialog that they would typically not use when interacting with humans. While some devices may be configured to remain active for up to a few seconds after the first interaction, to avoid repeated wake words for immediate follow-ups, it is often difficult for the user to know whether the device is still active. The second drawback is that the voice assistant does not know what happens in the home or other environment between commands directed to the device. While this is beneficial for privacy, it means that an intelligent voice assistant (such as one that may be implemented with an example of the present invention) does not know what is happening at home and may lack important context that would help it assist the household, for example on an ongoing basis.
Two further disadvantages are associated with recognizing the wake word: wake word recognition is performed using imperfect statistical machine learning methods. This imperfection shows itself in two ways. The first is accidentally hearing the wake word when it was not actually spoken: for example, a snippet of television audio, a similar-sounding phrase, or even a mention of the device that was not intended to wake it (e.g. talking about a friend called Alexa). In the case of accidental activation, a small amount of sound is inadvertently streamed out of the house anyway, with privacy consequences. The second is that the wake word is not recognized even though the user has spoken it. In such cases the user will often have to repeat themselves until the device wakes, which is frustrating for the user and increases the time needed to achieve the desired result.
Examples of voice assistants implemented with examples of the invention may address these limitations by creating a private cloud environment for the home's data, where the private data used by the voice assistant is cryptographically isolated from the company supplying the voice assistant and from other users. Unlike prior art voice assistants, which are seen as a single entity shared by everyone, some examples based on such a private cloud approach may also be seen as a unique voice assistant dedicated to that home or household, which is aware of, and can be trusted with, private home data and secrets.
According to various examples, this is implemented using an encryption method in which the key is split into three parts, and in which any two of the three parts give access to the private data. One of these key parts is owned and maintained by the user and household, and is kept on the local device or stored in an associated smartphone application. The second part is maintained by the company supplying the voice assistant, and the third part is maintained by a separate entity, ideally a separate legal entity, possibly even in a separate legal jurisdiction. The day-to-day operation of the voice assistant combines the user-maintained key part with the vendor-maintained part, so that the voice assistant using the private cloud can operate normally. However, this approach prevents any employee of the voice assistant provider from accessing private information, because they have access to only a single key part. The relationship between the voice assistant provider and the third entity is governed by a contract and a set of procedures that strictly control how and when they may cooperate, in order to maintain the privacy of end users and retain their trust. An example of when they might cooperate is to restore a new third key to an end user who has lost access to their key, once a request and reasonable evidence of this have been obtained from the user. Another example might be a court order, or limited circumstances following a criminal investigation. This arrangement prevents casual access to the user's private data under most normal circumstances. In an alternative example, a single private key maintained solely by the household is used to access the data, optionally combined with a method of backing up and protecting the key against loss. There are a number of methods known to practitioners in the relevant arts for implementing key combination so that data can be accessed jointly but is denied to any single key holder.
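One simple way to realize the "any two of three key holders" property is sketched below in Python: the data key is XOR-split once per pair of holders, so any two holders can reconstruct it but no single holder can. This is only an illustration of the threshold idea; it is not the encryption scheme actually used by the system described here.

# 2-of-3 sharing by pairwise XOR splits: each pair of holders jointly holds
# one split of the data key, so any two holders can recover it.
import secrets

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split_two_of_three(data_key: bytes):
    holders = {"household": [], "provider": [], "third_party": []}
    for left, right in (("household", "provider"),
                        ("household", "third_party"),
                        ("provider", "third_party")):
        r = secrets.token_bytes(len(data_key))          # random pad for this pair
        holders[left].append((left + "+" + right, r))
        holders[right].append((left + "+" + right, xor(data_key, r)))
    return holders

def recover(shares_a, shares_b):
    # Find the pad/ciphertext pair that the two holders have in common.
    a = dict(shares_a)
    for pair, piece in shares_b:
        if pair in a:
            return xor(a[pair], piece)
    raise ValueError("these two holders do not share a split")

key = secrets.token_bytes(16)
shares = split_two_of_three(key)
assert recover(shares["household"], shares["provider"]) == key
assert recover(shares["household"], shares["third_party"]) == key
print("any two of the three shares recover the data key")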
Voice assistant privacy and end-user trust may be further protected with additional privacy modes. Prior art voice assistants rely on wake words as described above, sometimes with a physical button that can mute the device entirely. Examples of the invention may include an additional "deep sleep" mode that can be entered by voice and from which a much longer or more unusual wake word is required to wake the device, eliminating the risk of false activation from background noise or the incidental conversation mentioned above.
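A minimal sketch of the normal versus "deep sleep" wake-word behavior (in Python; the phrases, the command and the state names are illustrative assumptions, not the product's actual wake words):

# Toy state machine: in normal mode the usual wake word activates the device,
# while in deep sleep only a much longer, more unusual phrase does.
class WakeWordListener:
    NORMAL_WAKE = "brian"
    DEEP_SLEEP_WAKE = "brian please wake up and start listening again"

    def __init__(self):
        self.mode = "normal"          # "normal" or "deep_sleep"

    def hears(self, utterance: str) -> bool:
        """Return True if the device should activate for this utterance."""
        u = utterance.lower().strip()
        if self.mode == "deep_sleep":
            if u == self.DEEP_SLEEP_WAKE:
                self.mode = "normal"
                return True
            return False                                 # stays asleep on anything shorter
        if u == "go to deep sleep":
            self.mode = "deep_sleep"
            return False
        return u.startswith(self.NORMAL_WAKE)

listener = WakeWordListener()
print(listener.hears("Brian, what's the weather?"))      # True
listener.hears("go to deep sleep")
print(listener.hears("Brian, play a song"))              # False: short wake word ignored
print(listener.hears(WakeWordListener.DEEP_SLEEP_WAKE))  # True: device wakes again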
The privacy improvements from the private cloud approach described herein also enable a "joined-in" mode in which the device continues to listen to and process conversations, and potentially participates where appropriate. In some examples this mode may even be the default, with the voice assistant effectively a member of the household that is fully present during household conversations.
Multiple voice assistants
According to various examples, a private voice assistant may be further customized by the user, possibly adjusting or extending the principles under which it operates, its personality, its name and its voice. In examples with a visual representation, this may also be customized. The result, in the preferred example, is that the home's or person's voice assistant is conceptualized as a trusted, unique entity, separate from any other voice assistant, which can be trusted with private data that is not shared with anyone else (not even its provider).
In various examples, different assistants may communicate with each other to ask questions about private knowledge known to the destination assistant, to request actions that the remote assistant can take, or to share information. When communicating with an assistant that uses UL, these communications may be made in UL. For voice assistants not implemented with UL, the communication may be in natural language.
The following concepts are provided.
Concept A: Semantic nodes
A method for automatically analyzing or using heterogeneous data, comprising the steps of:
providing a structured representation of data representing a generic language or corpus of natural language words, concepts or other things, wherein the structured representation enables a machine system to determine at least some aspects of meaning or semantic content of the words, concepts or other things;
and wherein the structured representation of a particular word, concept, or other thing can be generated locally by the user and automatically becomes a shared identifier of the particular word, concept, or other thing in the generic language or corpus if shared with or available to other users.
Concept B: Principles
A method for automatically analyzing or using heterogeneous data, comprising the steps of:
Providing a structured representation of data, which may represent any natural language word, concept, or other thing, such that at least some of the meaning or semantic content of the word, concept, or other thing may be determined by a machine system;
wherein the structured representation of data includes one or more principles, statements, or other rules defining goals or motivations for the machine system, and the machine system is configured to operate at least in part by selecting or deciding actions that autonomously optimize or otherwise affect its achievement or implementation of such principles, statements, or other rules.
Concept C: Target solver
A computer-implemented method comprising the steps of:
(i) Accessing stored data or storing data, the stored data being in a language that represents human-known knowledge, wherein the stored data is stored in a machine-readable and machine-processable representation, and wherein the stored data is not stored in human language only;
(ii) Receiving and storing one or more target statements, wherein the stored one or more target statements are stored in a language that represents human-known knowledge;
(iii) Processing the stored one or more target statements, and accessing and processing the stored data in the language representing human-known knowledge, to derive a solution to the one or more target statements using the stored data in the language representing human-known knowledge; and
(iv) The solution is stored or output.
Concept D: Crossword solver
A method for automatically analyzing and solving a puzzle, comprising the steps of:
providing a structured representation of data, which may represent any natural language word, concept, or other thing, such that the meaning or semantic content of the word, concept, or other thing may be determined by a machine system;
providing a structured representation of data representing natural language conclusions, inferences, or other logical processes;
generating a structured representation of the puzzle and of the clues in the crossword grid;
the machine system autonomously using the structured representations of natural language words, concepts, or other things, and the structured representations of natural language conclusions, inferences, or other logical processes, to generate candidate answers to the clues.
The following sub-features may be applied to any of the concepts a-D described above.
Heterogeneous data is sufficiently extensive that a schema is impractical.
Heterogeneous data is not stored as a schema.
Heterogeneous data is not stored as natural language.
The generic corpus representing the meaning of natural language words includes all the words in the dictionary.
A generic corpus of natural language concepts is derived from machine analysis of natural language documents or dialogues.
A generic corpus of natural language words, concepts or other things is derived from machine analysis of natural language documents or dialogues.
The structured representation of a word encodes the semantic meaning of the word by linking to a structured representation of the relevant word, concept, other term or logical process.
The structured representation of a particular word, concept, or other thing, once generated, is a unique identifier of that particular word, concept, or other thing in a generic language or corpus.
There are multiple different structured representations of the same specific word, concept, or other thing, but each exists locally only and is not part of a generic language or corpus.
The unique identifier is a 128-bit UUID.
The structured representation of a particular word, concept, or other thing may be related to any of the following: each specific human, the concept of a human (any specific human being a member thereof), each file, each web page, each audio recording or video, specific relationships (including relationships linking any specific human to the concept of a human), attributes, specific types of language nuances, and each row and item in the relational database table.
The structured representation is an ordered or partially ordered combination of combined or linked nodes or semantic nodes in the network, the combined or linked nodes being structured representations of related words, concepts, other terms, or logical processes.
The composition node generates new words, concepts or other terms with new meaning or semantic content in the generic language.
Ordered or partially ordered sets of structured representations capture specific meaning or semantic content.
The machine learning system generates new nodes and links between nodes by autonomous learning from natural language documents or dialogs.
The structured representation represents natural language conclusions, inferences or other logical processes.
A structured representation of a conclusion, inference, or other logical process is used to infer and output the results of the inference.
Nodes of the structured representation are used to constitute a memory or repository of knowledge or relationships between words, concepts, other things, and conclusions, inferences, or other logical processes.
Nodes of the structured representation are used to understand verbal or written communications.
The node network of the structured representation is used to generate spoken or written communications.
The network of nodes of the structured representation forms the basis of a generic intelligent system.
The representation of heterogeneous data is used in applications related to managing health.
The representation of heterogeneous data is used in applications related to managing nutrition.
The representation of heterogeneous data is used in applications related to matching job-seekers to jobs.
The representation of heterogeneous data is used in accounting-related applications.
The representation of heterogeneous data is used in applications associated with a voice assistant or chatbot.
Heterogeneous data is used in applications related to searching the WWW.
Further aspects of examples of the invention are described by the following clauses:
System driven by UL or the like for vertical applications
(1) A system operable to provide a useful vertical application, wherein the useful vertical application requires heterogeneous and extremely wide-ranging data, the system including at least one data storage area containing a machine-readable data representation that encodes meaning.
(2) The system of clause 1, wherein the useful vertical application is an application operable to automatically match a candidate with a job, or a health application, or an accounting application, or a chatbot, or a voice assistant.
(3) The system of clause 1 or clause 2, wherein the machine-readable data representation is a machine language comprising combinations of semantic nodes representing entities, wherein meaning comes from the way the semantic nodes are selected and combined.
(4) The system of clause 3, wherein the system is further operable to receive a description of the entity from the remote system and use the description to return a semantic node corresponding to the entity.
(5) The system of any one of the preceding clauses wherein the data includes a representation of computing power available to the application.
(6) The system of any of the preceding clauses wherein the system is further operable to enable automatic identification of data for removal from the data store.
(7) The system of any of the preceding clauses, wherein the system is further operable to infer with reference to the contents of at least one data store, wherein new useful data is generated for the useful vertical application.
(8) The system of clause 7, wherein the new useful data is stored such that the new useful data can be used in the future without further reasoning.
(9) The system of clause 6, wherein automatically identifying the data for removal from the data store is accomplished using analysis of signals related to the authenticity or utility of the data from the application user.
Principle driven intelligent system
(1) A system comprising at least one data storage area containing machine-readable principles representing purposes and rules that guide the system, and wherein the system is further operable to take actions compliant with the principles by referencing the principles.
(2) The system of clause 1, wherein the system is further operable to examine the potential action against the principle and determine that the potential action complies with the principle.
(3) The system of clause 1 or clause 2, wherein the system is further operable to propose an action compliant with the principle by referencing the principle.
(4) The system of any of the preceding clauses, wherein the action comprises communicating with the user in written form.
(5) The system of any of the preceding clauses, wherein the action comprises communicating with the user in verbal form.
(6) The system of any one of the preceding clauses, wherein the principle comprises at least one metric that the system should attempt to maximize.
(7) The system of clause 6, wherein the at least one metric comprises user happiness.
(8) The system of any one of the preceding clauses, wherein the principle comprises at least one metric that the system should attempt to minimize.
(9) The system of clause 8, wherein the at least one metric comprises user unhappiness.
(10) The system of any of the preceding clauses, wherein the rules comprise at least one rule concerning actions that the system must not do, and wherein the system is further operable to avoid doing those actions by referencing the rules.
(11) The system of any one of the preceding clauses wherein the principles include: at least one suggestion of what action to do under defined conditions.
(12) The system of any of the preceding clauses wherein the action includes accessing other remote computer systems.
(13) The system of any of the preceding clauses, wherein the action comprises changing a state of a device linked to the system via a network.
(14) The system of any one of the preceding clauses, wherein the action comprises initiating a verbal interaction with a human.
(15) The system of any of the preceding clauses, wherein the system further comprises at least one data store containing a machine-readable representation of the world encoding the meaning, and wherein the system is further operable to reason with reference to the machine-readable representation of the world to select the action that complies with the rules.
(16) The system of clause 15, wherein the machine-readable representation of the world comprises a representation of an effective inference step, and wherein the system is further operable to infer using the representation of the effective inference step.
(17) The system of clause 15 or clause 16, wherein the machine-readable representation of the world comprises a representation of computing capabilities available to the system, and wherein the system is further operable to utilize the computing capabilities by referencing the machine-readable representation.
(18) The system of clauses 15, 16 or 17, wherein the system is operable to learn and augment a machine-readable representation of the world.
(19) The system of clause 18, wherein the system is operable to learn from communications with at least one user.
(20) The system of clause 18, wherein the system is operable to learn from at least one external sensor connected to the system via a network.
(21) The system of any of the preceding clauses, wherein the machine readable principles are represented at least in part by a combination of identifiers, and wherein at least some of the identifiers represent concepts corresponding to real world things.
(22) The system of clause 21, wherein the system is further operable to receive a description of the concept from the remote system and use the description to return an identifier that may mean the concept.
(23) The system of any of the preceding clauses, wherein the system is operable to continue reasoning in a manner that results in a compliant action.
(24) The system of any one of the preceding clauses wherein the system is operable to answer questions from a human user regarding the principle.
Principle driven intelligent system #2
(1) A computer system comprising a long-term memory; a short-term memory; and a principles store containing machine-readable principles representing principles that guide the system, and wherein the computer system is operable to receive events and to use the events, the contents of the long-term memory, the contents of the short-term memory and the principles to perform actions that comply with the principles.
(2) The computer system of clause 1, wherein the event comprises a communication from at least one user and wherein the action comprises a communication with at least one user.
(3) The computer system of any of the preceding clauses, wherein the system is further operable to learn and store the content learned by the system to long term memory.
(4) The computer system of any of the preceding clauses wherein the computer system is not operable to change the principles.
(Additional safety examples:)
(5) The computer system of clause 4, wherein the principles include a principle that prohibits actions that might result in the principles being changed.
(6) The computer system of any of the preceding clauses, wherein the system is further operable to conduct an independent check for each potential action against the principle, and to discard the potential action if the independent check finds that it is not compatible with the principle.
(7) The computer system of any of the preceding clauses further operable to actively exclude knowledge about itself for determining an action.
Translation
(1) A method of generating a machine-readable semantic representation of a segment of natural language, comprising passing the segment of natural language through a sequence-to-sequence neural architecture trained from training data comprising pairs of natural language and corresponding structured representations encoding meanings.
(2) The method of clause 1, wherein the neural architecture comprises an encoder and a decoder, and wherein the method comprises the further step of: using a beam search during decoding of semantic representations from the decoder to remove invalid semantic representations.
(3) The method of clause 1 or clause 2, wherein the segment of natural language is a question, and wherein the method further comprises the step of: answering the question with reference to the semantic representation.
(4) The method of clause 1 or clause 2, wherein the segment of natural language is one or more documents, and wherein the method further comprises the steps of: questions are answered using semantic representations of one or more documents.
(5) The method of clause 3 or clause 4, wherein the method further comprises the steps of: inference is made with reference to the semantic representation to produce a further representation that did not exist prior to this step.
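A minimal sketch of the validity-constrained beam search in clause (2) is shown below in Python. The "decoder" here is a random stub standing in for the trained sequence-to-sequence model, and the only validity check is parenthesis balance; both are assumptions made purely for illustration.

# Beam search over decoder steps, pruning hypotheses that can no longer be
# valid UL (here: a ')' appearing with no matching '(').
import math, random

VOCAB = ["(", ")", "Question", "X", "Equal", "Add", '(Integer "2")', "<EOS>"]

def decoder_scores(prefix):
    # Stand-in for the trained neural decoder: pseudo-random log-probabilities.
    rng = random.Random(" ".join(prefix))
    return {tok: math.log(rng.random() + 1e-9) for tok in VOCAB}

def could_still_be_valid(tokens):
    depth = 0
    for tok in tokens:
        depth += tok.count("(") - tok.count(")")
        if depth < 0:
            return False
    return True

def balanced(tokens):
    return sum(tok.count("(") - tok.count(")") for tok in tokens) == 0

def beam_search(beam_width=3, max_len=12):
    beams = [([], 0.0)]
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens and tokens[-1] == "<EOS>":
                candidates.append((tokens, score))       # already finished
                continue
            for tok, logp in decoder_scores(tokens).items():
                hypothesis = tokens + [tok]
                if could_still_be_valid(hypothesis):     # prune invalid semantic forms
                    candidates.append((hypothesis, score + logp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    finished = [b for b in beams if b[0] and b[0][-1] == "<EOS>" and balanced(b[0][:-1])]
    return finished or beams

print(beam_search()[0][0])   # best hypothesis kept by the constrained search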
Work matching application
(1) A system operable to match candidates with vacant roles, comprising at least one data store comprising:
a plurality of candidate profiles, wherein at least some portions of at least some of the candidate profiles are in a structured machine-readable form encoding meaning;
a plurality of job specifications for the vacant roles, wherein at least some portions of at least some of the job specifications are stored in a structured machine-readable form encoding meaning, and
wherein the system is further operable to match the plurality of candidate profiles with the plurality of job specifications to identify high-confidence matches between candidates and vacant roles.
(2) The system of clause 1, wherein the structured machine-readable form is a language that represents meaning by creating a combination of identifiers, and wherein at least some of the identifiers represent human skills and experience.
(3) The system of any of the preceding clauses, wherein the at least one data store further stores a representation of the candidate's desired role, represented at least in part in a structured machine-readable form, and wherein the system is further operable to match the vacant role against the representation of the candidate's desired role to improve the match between the candidate and the vacant role.
(4) The system of any one of the preceding clauses, wherein the system is further operable to send a push notification to the mobile device when a high confidence match is found.
(5) The system of any of the preceding clauses, wherein the system is further operable to explain how the candidate matches the role by generating an explanation of which parts of the job specification match the candidate's skills and experience.
(6) The system of clause 5, wherein the interpretation is in natural language.
(7) The system of any of the preceding clauses, wherein the system is operable to match requirements in the job specification with the skills and experience of the candidate even where there are no common keywords between the candidate's resume and the relevant portions of the natural language version of the job specification.
(8) The system of any of the preceding clauses, wherein the system is operable to perform a series of logical reasoning steps to match a skill or experience of the candidate with a requirement in the job specification.
Health application
(1) A system for managing a broad set of health data for one or more persons, wherein at least some of the health data is represented in a structured machine-readable form encoding meanings stored in one or more data stores.
(2) The system of clause 1, wherein the health data comprises nutritional data regarding food or beverage that has been consumed by at least one of the one or more people.
(3) The system of clause 2, wherein the nutritional data comprises data representing uncertainty regarding the amount or composition of the consumed thing.
(4) The system of any one of the preceding clauses, wherein the health data comprises data about: the results of blood tests or measurements or body composition or activity information or genetic data or microbiome data or bowel movement events or sleep data or exercise data or activity data or disease symptoms or human moods or menses or drug intake or medical conditions or data from any wearable device.
(5) The system of any of the preceding clauses, wherein the system is further operable to talk to one or more users via text.
(6) The system of any of the preceding clauses, wherein the system is further operable to enable selected others to speak to one or more users and enable selected others to view relevant health data.
(7) The system of any of the preceding clauses, wherein the system is further operable to plot selected types of health data together, so that a user can see how the different data are related.
(8) The system of any one of the preceding clauses, wherein the system is further operable to analyze the health data to reveal insight related to the health of a particular user.
(9) The system of any one of the preceding clauses, wherein the insights comprise potential dietary intolerances or behaviors affecting sleep.
(10) The system of any of the preceding clauses, wherein elements of the health data are combined to calculate additional health data items that have not yet been present in the health data.
(11) The system of clause 10, wherein the additional health data item is an estimate of caffeine present in the user at a particular time.
Accounting application
(1) A system for managing accounting data for at least one business, wherein at least some of the accounting data is represented in a structured machine-readable format that encodes real world meanings stored within one or more data stores.
(2) The system of clause 1, wherein the structured machine-readable format comprises a combination of identifiers, wherein at least some of the identifiers represent real world entities related to the activity of the at least one business, and wherein the further meaning is encoded according to a selection of the combination of identifiers.
(3) The system of any of the preceding clauses, wherein the system is operable to automatically present the accounting data under a plurality of different accounting standards.
(4) The system of any one of the preceding clauses, wherein the system is operable to answer questions about the activities of at least one enterprise.
Privacy enhanced voice assistant
(1) A system provided by a system provider for providing services to at least one user via a voice user interface, comprising at least one device local to the at least one user, wherein the at least one device is operable to stream sound data to one or more remote data storage areas, wherein the sound data is stored encrypted within the one or more remote data storage areas using an encryption method in which at least two of at least two different encryption keys are required to read the sound data.
(2) The system of clause 1, wherein a first of the at least two different encryption keys is maintained within at least one device local to the user, and wherein a second of the at least two different encryption keys is maintained by the system provider.
(3) The system of clause 2, wherein the number of different encryption keys is at least three, and wherein a third one of the different encryption keys is maintained by an entity other than both the user and the system provider.
(4) The system of any of the preceding clauses, operable to stream general sound from the at least one device and to use information learned from that sound to improve its value to the at least one user.
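By way of illustration only, one way clause (1) could be realized is with layered ("envelope") encryption, sketched below assuming Python and the Fernet recipe from the cryptography package (assumptions, not part of the disclosure): the stored sound data can only be read when both the device-held key and the provider-held key are available.

```python
# Illustrative sketch only (assumed stack: Python + the `cryptography` package's
# Fernet recipe).  Two independent keys are applied in layers, so reading the
# stored sound data requires BOTH the device-held key and the provider-held key.
from cryptography.fernet import Fernet

device_key = Fernet.generate_key()     # kept on the user's local device
provider_key = Fernet.generate_key()   # kept by the system provider

def encrypt_for_storage(sound_bytes: bytes) -> bytes:
    inner = Fernet(device_key).encrypt(sound_bytes)   # first layer: device key
    return Fernet(provider_key).encrypt(inner)        # second layer: provider key

def decrypt_from_storage(stored: bytes) -> bytes:
    inner = Fernet(provider_key).decrypt(stored)      # provider key alone is not enough...
    return Fernet(device_key).decrypt(inner)          # ...the device key is also required

ciphertext = encrypt_for_storage(b"raw audio frames")
assert decrypt_from_storage(ciphertext) == b"raw audio frames"
```

A third layer could be added in the same way for clause (3), with the additional key held by a party other than the user and the system provider.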
Enhanced privacy mode
A system with a voice user interface initiated with a first wake word, wherein the system is operable to enter a privacy-preserving state requiring a second wake word, and wherein the second wake word is sufficiently long or unusual that it is much less likely to be misrecognized than the first wake word.
Multi-voice assistant system
(1) A system operable to deliver experiences of a plurality of different voice assistants to a plurality of users, the system comprising at least one data store containing personality information that determines personalities of at least some of the plurality of different voice assistants.
(2) The system of clause 1, wherein the personality information includes information regarding: the gender or name of the voice assistant, or its voice, mood, emotional responses, formality, position on an extroversion-introversion scale, position on any of the Myers-Briggs scales or classification into a Myers-Briggs type or other personality test, or visual appearance.
(3) The system of any of the preceding clauses, wherein the at least one data store further comprises at least one set of machine-readable principles representing the purposes and rules that guide at least some of the plurality of voice assistants, and wherein the system is further operable to act in accordance with those principles by referencing them.
(4) The system of clause 3, wherein the at least one set of machine-readable principles is a plurality of sets of machine-readable principles, and wherein a selected one of the plurality of different voice assistants is mapped to a selected one of the plurality of sets of machine-readable principles, wherein the different voice assistants are driven by different principles.
(5) The system of any of the preceding clauses wherein the at least one data store further comprises private user data accessed only by a voice assistant selected from a plurality of different voice assistants.
Example use case
By way of example, embodiments of the invention may be used for the following applications:
Any language-based human-machine interface (spoken or text) in which the machine's user experience is expressed in UL.
Converting web pages to UL for searching and analysis (in the limit, all web pages).
Converting all maps (especially the ultra-high-resolution maps and related metadata required for autonomous driving) to UL.
Performing location-based search over map data expressed in UL.
Identifying relevant advertisements and news to serve someone based on their social media profile expressed in UL.
Identifying relevant advertisements and news to serve someone based on their web search and web browsing history expressed in UL.
Suggesting potential friends or contacts based on similar social media or work profiles expressed in UL.
Identifying abusive posts on social media that have been converted to UL.
Identifying messages and posts with national-security or criminal relevance, after conversion to UL.
Analyzing customer reviews and feedback that have been converted to UL.
Analyzing shopping requests converted to UL in order to identify matching products in a product database expressed in UL.
Automatically answering questions based on analysis of web pages that have been converted to UL.
A dating website based on matching profiles converted to UL, or on identifying other correlations that indicate compatibility.
Generating summaries, e.g. news summaries, from source documents converted to UL.
Note
It is to be understood that the above-described arrangements are only illustrative of the application of the principles of the present invention. Numerous modifications and alternative arrangements may be devised without departing from the spirit and scope of the present invention. While the invention has been illustrated in the drawings and fully described above with reference to what are presently considered the most practical and preferred example(s) of the invention, it will be apparent to those of ordinary skill in the art that numerous modifications may be made without departing from the principles and concepts of the invention as set forth herein.
Appendix 1
Key concepts
This Appendix 1 summarizes the key concepts disclosed in this specification. We divide these key concepts into the following 14 categories:
Concepts
A. Brackets for disambiguating combinations of nodes
B. Shared syntax across facts, queries, and reasoning
C. Nesting of nodes
D. ID selection
E. Any client can generate semantic nodes or paragraphs
F. Comprehensive Universal Language (UL) concepts
G. Question answering
H. Learning
I. Translation to and from UL
J. Semantic node resolution
K. Translation between natural languages
L. Voice assistant
M. Principles
N. Use cases:
n1: human-machine interface
N2: searching and analyzing documents or web pages
N3. map data, associated systems utilizing map data and location-based searches, represented as UL
N4. identifying relevant advertisements and news
Aggregation and summary of N5. news
N6. matching between people using UL
N7. identify abuse or unreal posts in social media
Analysis of N8. customer reviews
N9. shopping queries and product requests
N.10 job matching
N.11 horizontal health application
N.12 accountant
N.13 Voice Assistant/chat robot
Note that any concept A-N may be combined with any one or more other concepts A-N, and any concept A-N may be combined with any one or more optional features from any one or more other concepts A-N.
We define each of these concepts as follows:
machine readable language: semantic nodes and paragraphs
A. Brackets for disambiguating combinations of nodes
The UL model uses node combinations in brackets as the only, or primary, mechanism for representing unambiguous meaning, while still achieving enormous expressive power. This enables UL to be processed faster than approaches in which many different disambiguation mechanisms proliferate. It also simplifies storage, enabling faster searching and access. It may also make UL faster to write than other languages, and thus broaden adoption. It also reduces complexity and thereby makes many applications of the technology practical.
We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein the structured machine-readable representation comprises a single grammar term to disambiguate the meaning of the structured representation of data;
(b) Automatically processing the structured machine-readable representation for one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein the structured machine-readable representation comprises a single grammar term to disambiguate a meaning;
(b) Automatically processing the structured machine-readable representation for one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features
The single grammar term used to disambiguate meaning is brackets or parentheses.
A single grammar term for disambiguating a meaning is the only grammar term for disambiguating the meaning of different combinations of structured machine-readable representations of data.
A single grammar term for disambiguating the meaning of different combinations of structured machine-readable representations of data is the primary grammar term for disambiguating the meaning of the combinations.
The single grammar term used to disambiguate meaning represents nesting of the structured machine-readable representations of data.
A single grammar term for disambiguating meaning represents the nesting of semantic nodes and paragraphs.
A single grammar term for disambiguating meaning represents the nesting of semantic nodes and paragraphs to arbitrary depths.
A single grammar term for disambiguating meaning requires that semantic nodes and paragraphs be combined only in nested combinations.
A single grammar term for disambiguating meaning allows expressions to be nested indefinitely to allow a user to define concepts as a hierarchy of semantic nodes along with contextual information about the concepts.
A single grammar term for disambiguating meaning allows a combined semantic node to contain any finite number of semantic nodes, and the semantic nodes within those combined nodes may themselves be combined nodes, creating any level of nesting.
The syntax of the structured machine-readable representation of the data conforms, or substantially conforms, to the production grammar "<passage> ::= <id> | (<passage> <passage>...)", where "<passage>..." represents zero or more further paragraphs, and where <id> is an identifier of a semantic node.
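By way of illustration only (not part of the claimed subject matter), the production grammar above can be captured in a few lines of Python; the node identifiers used here are hypothetical placeholders, and a real implementation would use the identifier scheme described under concept D.

```python
# Illustrative sketch of the production grammar above:
#   <passage> ::= <id> | (<passage> <passage> ...)
# A paragraph is either a semantic-node identifier or a bracketed combination
# of paragraphs; brackets are the only disambiguation mechanism.
from typing import Tuple, Union

Passage = Union[str, Tuple["Passage", ...]]

def is_valid(p: Passage) -> bool:
    """True if p conforms to the grammar, at any nesting depth."""
    if isinstance(p, str):                      # a bare semantic-node identifier
        return len(p) > 0
    return isinstance(p, tuple) and len(p) >= 1 and all(is_valid(q) for q in p)

def render(p: Passage) -> str:
    """Serialise a paragraph using brackets alone to disambiguate combinations."""
    if isinstance(p, str):
        return p
    return "(" + " ".join(render(q) for q in p) + ")"

example = ("IsA", ("CapitalOf", "France"), "City")   # placeholder node names
assert is_valid(example)
print(render(example))    # -> (IsA (CapitalOf France) City)
```

Because brackets are the only combining mechanism, a single recursive function suffices both to validate and to serialise any paragraph.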
B. Shared syntax across facts, queries, and reasoning
The UL model uses a shared syntax for semantic nodes and paragraphs that is suited to representing factual statements, query statements, and inference statements alike. This enables UL to be processed faster than approaches in which different syntaxes proliferate. It may also make UL faster to write than other languages, and thus broaden adoption. It also simplifies storage, enabling faster searching and access. It also reduces complexity and thus increases the feasibility of many applications of the invention.
We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein the language has a syntax that is a single shared syntax suitable for representing paragraphs of factual statements, query statements, and inference statements;
(b) Automatically processing the structured machine-readable representation for one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein the language has a syntax that is a single shared syntax suitable for representing paragraphs of factual statements, query statements, and inference statements;
(b) Automatically processing the structured machine-readable representation for one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
The syntax applies to all structured machine-readable representations of data.
The syntax is a simple, unambiguous syntax that includes nesting of the structured machine-readable representations of data.
The syntax is a simple, unambiguous syntax in which structured machine-readable representations of data can be nested to any depth.
The syntax is a simple, unambiguous syntax in which structured machine-readable representations of data can only be combined in nested combinations.
The syntax allows expressions to be nested without limit, allowing a user to define concepts, together with contextual information about those concepts, as a hierarchy of structured machine-readable representations of data.
A combination of structured machine-readable representations of data may contain any finite number of structured machine-readable representations of data, thereby creating any level of nesting.
The structured machine-readable representation of the data is a semantic node or paragraph.
Semantic nodes are identified with UUIDs.
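As a purely illustrative sketch (assuming Python and the nested-tuple representation sketched under concept A), the same shared syntax can carry a factual statement, a query statement, and an inference statement; the node names and the structure of the inference statement are placeholders, not identifiers defined in this specification.

```python
# Purely illustrative sketch: one nested-tuple syntax carries facts, queries
# and inference statements alike, so a single store and parser handles all three.
# All node names below are hypothetical placeholders, not defined identifiers.

fact = ("CapitalOf", "France", "Paris")                       # a factual statement

# Question form described under concept G below: (Question <unknowns>) (<passage>)
query = (("Question", "X"), ("CapitalOf", "France", "X"))     # a query statement

# An inference statement in the same syntax (structure shown is illustrative only)
rule = (("If", ("CapitalOf", "Y", "X")), ("Then", ("LocatedIn", "X", "Y")))

paragraph_store = {fact, query, rule}    # tuples are hashable: one store for all three
for p in paragraph_store:
    print(p)
```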
C. Nesting of nodes
The UL model uses an unambiguous syntax that includes nesting of semantic nodes and paragraphs (i.e., structured machine-readable representations of data). This lack of ambiguity enables the machine to process and utilize the data stored in the model with certainty about what is represented, in contrast to natural language.
We can further generalize to:
A computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein the syntax of the machine-readable language is a substantially unambiguous syntax including a nesting of the structured machine-readable representation of the data;
(b) Automatically processing the structured machine-readable representation for one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein the language has a syntax, wherein the syntax of the machine-readable language is a substantially unambiguous syntax comprising a nesting of the structured machine-readable representation of the data;
(b) Automatically processing the structured machine-readable representation for one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
The syntax is a simple, unambiguous syntax in which structured machine-readable representations of data can be nested to any depth.
The syntax is a simple, unambiguous syntax in which structured machine-readable representations of data can only be combined in nested combinations.
The syntax allows expressions to be nested without limit, allowing a user to define concepts, together with contextual information about those concepts, as a hierarchy of structured machine-readable representations of data.
A combination of structured machine-readable representations of data may contain any finite number of structured machine-readable representations of data, thereby creating any level of nesting.
The structured machine-readable representation of the data is a semantic node or paragraph.
Semantic nodes are identified with UUIDs.
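A small hypothetical illustration of arbitrary-depth nesting under this unambiguous syntax (again reusing the tuple representation from concept A; the node names are placeholders):

```python
# Hypothetical illustration: nesting depth of a paragraph under the bracket-only
# syntax (node names are placeholders; depth is unlimited by the syntax itself).
def depth(p) -> int:
    """0 for a bare semantic node, otherwise 1 + the deepest nested paragraph."""
    if isinstance(p, str):
        return 0
    return 1 + max(depth(q) for q in p)

p = ("Population", ("CapitalOf", ("NeighbourOf", "Spain")))
assert depth(p) == 3
```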
D. ID selection
The UL model uses semantic node identifiers selected from an address space that is large enough to enable a user to select a new identifier with negligible risk of selecting a previously assigned identifier. This enables a user to apply the invention to local data without coordinating with any other user, while also benefiting from shared nodes that have meaning to more than one user.
We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein the structured machine-readable representation of data comprises a plurality of identifiers selected from an address space that is large enough to enable a user to select a new identifier with negligible risk of selecting a previously assigned identifier;
(b) Automatically processing the structured machine-readable representation for one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein the structured machine-readable representation of data comprises a plurality of identifiers selected from an address space that is sufficiently large to enable a client entity to select a new identifier with negligible risk of selecting a previously assigned identifier;
(b) Automatically processing the structured machine-readable representation for one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
once defined, the semantic node has an identifier or ID.
The identifier is selected from an address space that is large enough to enable the client entity to select a new identifier independent of other client entities without duplication.
The identifier is selected from an address space that is large enough to enable the client entity to select a new identifier with negligible risk of selecting a previously assigned identifier.
The identifier or ID is a UUID.
The ID is a 128-bit version 4 UUID (RFC 4122) with hyphenated lower-case syntax.
The ID is a UUID or a string, such as a Unicode string.
The string may itself be treated as a structured machine-readable representation of data; its meaning is, strictly speaking, the string itself, and any natural-language meaning contained within the string is not part of the meaning of the string.
The string is represented by an ID as an additional identifier.
The string is represented as a UUID or other numeric ID and a separate paragraph links the string to that numeric ID to provide its meaning.
Two identical strings used as structured machine-readable representations of data have the meaning common to the strings.
Any user can create a structured machine-readable representation of his own data with his own local meaning by picking up unused identifiers.
Any user can create his own identifier for the semantic node even if another identifier is already used for the semantic node.
Any user is free to define their own meaning for the combination of structured machine-readable representations of data.
There may be multiple different structured machine-readable representations of data for the same particular word, concept, or other thing.
Any user who chooses to create a paragraph using shared structured machine-readable representations of data expresses the same meaning by combining them, so that the meaning conveyed by combining shared structured machine-readable representations of data is generic.
Each meaning of each word in the dictionary is represented by a structured machine-readable representation of the data.
"shared ID" is an ID used by more than one user; "private ID" or "local ID" is similarly an ID that is used by only one user and is not issued or exposed to other users; the "common ID" is an ID that the user has used in the UL that each user can see.
Semantic nodes are structured machine-readable representations of data that once defined have identifiers so that they can be referenced in a machine-readable language.
A paragraph is a combination of semantic nodes that express meaning and is a unique nested structure.
An unlimited class of semantic nodes may be represented as combinations of multiple other nodes.
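A minimal sketch of minting a new semantic-node identifier as described above, assuming Python's standard uuid module; with 122 random bits per version 4 UUID, the risk of colliding with a previously assigned identifier is negligible, so no coordination between users is required.

```python
# Minimal sketch: minting a semantic-node ID as a version 4 UUID (RFC 4122),
# hyphenated lower-case.  With 122 random bits, independently minted IDs
# collide only with negligible probability, so no central coordination is needed.
import uuid

def new_node_id() -> str:
    return str(uuid.uuid4())   # 36-character hyphenated lower-case string

node_id = new_node_id()
assert len(node_id) == 36 and node_id == node_id.lower()
print(node_id)
```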
E. Any client can generate semantic nodes or paragraphs
The UL model uses semantic node identifiers selected from an address space that is large enough to enable a user to select a new identifier with negligible risk of selecting a previously assigned identifier. This makes UL faster and easier to create than other languages, and thus broadens adoption. It also enables users to apply the technique to their local data while still benefiting from paragraphs and implementations generated by other users.
We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein the machine-readable language is extensible in that there is no limit on which users can create structured machine-readable representations of data or the associated identifiers;
(b) Automatically processing the structured machine-readable representation for one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein the machine-readable language is extensible in that there is no limit on which users can create structured machine-readable representations of data or the associated identifiers;
(b) Automatically processing the structured machine-readable representation for one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
The machine-readable language is a general-purpose language in which essentially anything expressible in natural language can be expressed as a structured machine-readable representation of data or as a combination of structured machine-readable representations of data.
The structured machine-readable representation of the data represents a particular entity, such as a word, concept, or other thing, and once generated, uniquely identifies the particular word, concept, or other thing in a common language.
Ordered or partially ordered sets of structured machine-readable representations of data capture specific meaning or semantic content.
The meaning of the structured machine-readable representation of data comes from statements written in a machine-readable language.
The meaning of the structured machine-readable representation of data comes from the structured machine-readable representation of other data representing what has been said about the structured machine-readable representation of data.
Semantic nodes representing an entity encode the semantic meaning of the entity by links to structured machine-readable representations of data of related words, concepts, other terms, or logical processes.
The structured machine-readable representation of the combined data generates new words, concepts or other terms in a machine-readable language having new meaning or semantic content.
Machine-readable languages are understandable to human users, corresponding to equivalent statements in natural language.
The machine-readable language is extensible in that any natural language word, concept, or other thing can be represented by a structured machine-readable representation of data.
The machine-readable language is extensible in that there is no limit on which users can create a structured machine-readable representation of data.
Semantic nodes are structured machine-readable representations of data that once defined have identifiers so that they can be referenced in a machine-readable language.
A paragraph is a combination of semantic nodes that express meaning and is a unique nested structure.
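A hypothetical sketch of concept E in Python: a client mints its own identifier for a new local entity and gives it meaning purely by writing paragraphs that combine the new identifier with shared identifiers and a string node. The shared names used here stand in for shared IDs and are illustrative only.

```python
# Hypothetical sketch: a client extends the language on its own by minting a
# new identifier and giving it meaning through paragraphs that reference
# shared identifiers and a string node.  Shared names here are placeholders.
import uuid

SHARED_ISA = "IsA"               # stands in for a shared/public identifier
SHARED_HASNAME = "HasName"       # likewise illustrative
SHARED_BEVERAGE = "Beverage"

my_drink = str(uuid.uuid4())     # freshly minted local identifier, no coordination needed

local_paragraphs = [
    (SHARED_ISA, my_drink, SHARED_BEVERAGE),            # the new node is a beverage
    (SHARED_HASNAME, my_drink, "oat-milk flat white"),  # a string node carries the label
]

for p in local_paragraphs:
    print(p)    # these paragraphs can later be published, or kept entirely local
```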
F. Comprehensive UL concept
We can combine the above concepts together as follows:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein one or more of the following applies:
a single grammar term is used to disambiguate the meaning of the structured representation of the data;
the syntax of the machine-readable language is a single shared syntax suitable for representing paragraphs of factual statements, query statements, and inference statements;
The syntax of the machine-readable language is a substantially unambiguous syntax including nesting of structured representations of data;
the structured representation of data includes an identifier selected from an address space that is large enough to enable a user to select a new identifier with negligible risk of selecting a previously assigned identifier;
the machine-readable language is extensible in that there is no limit on which users may create a structured representation of the data or related identifiers;
(b) Automatically processing the structured representation of the data for one or more of the following: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory or accessing from memory a structured machine-readable representation of data conforming to a machine-readable language, the structured machine-readable representation comprising semantic nodes and paragraphs, wherein one or more of the following applies:
a single grammar term is used to disambiguate the meaning of the structured representation of the data;
The syntax of the machine-readable language is a single shared syntax suitable for representing paragraphs of factual statements, query statements, and inference statements;
the syntax of the machine-readable language is a substantially unambiguous syntax including nesting of structured representations of data;
the structured representation of data includes an identifier selected from an address space that is large enough to enable a user to select a new identifier with negligible risk of selecting a previously assigned identifier;
the machine-readable language is extensible in that there is no limit on which users may create a structured representation of the data or related identifiers;
(b) Automatically processing the structured representation of the data for one or more of the following: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
G. Question answering
The UL model enables questions to be answered automatically: questions are represented as a combination of paragraphs or semantic nodes, and answers can be generated automatically by three different processes: (i) matching the question against a paragraph previously stored in a paragraph memory area; (ii) acquiring and executing one or more computing units, wherein the computing units represent computational capabilities associated with answering questions; (iii) acquiring and executing one or more inference paragraphs, which represent the semantics of potentially applicable inference steps related to answering the question. This approach enables highly scalable, fast, accurate, semantics-based question answering. Questions may come from machines, or from a human user after the natural-language question has been translated into UL and the response translated back into natural language. We can generalize as follows:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein a question is represented in the memory as a structured machine-readable representation of data; and
(b) Automatically generating a response to the question using one or more of the following steps: (i) matching the question with a structured machine-readable representation of data previously stored in a memory storage area; (ii) acquiring and executing one or more computing units, wherein the computing units represent computational capabilities associated with answering questions; (iii) acquiring and executing one or more inference paragraphs, which are structured machine-readable representations of data representing the semantics of potentially applicable inference steps related to answering the question;
and wherein the representation of the question, the structured machine-readable representations of data previously stored in the memory storage area, the computing units, and the inference paragraphs are all represented in substantially the same machine-readable language.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language, wherein a question is represented in the memory as a structured machine-readable representation of data; and
(b) Automatically generating a response to the question using one or more of the following steps: (i) matching the question with a structured machine-readable representation of data previously stored in a memory storage area; (ii) acquiring and executing one or more computing units, wherein the computing units represent computational capabilities associated with answering questions; (iii) acquiring and executing one or more inference paragraphs, which are structured machine-readable representations of data representing the semantics of potentially applicable inference steps related to answering the question;
and wherein the representation of the question, the structured machine-readable representations of data previously stored in the memory storage area, the computing units, and the inference paragraphs are all represented in substantially the same machine-readable language.
In a preferred embodiment, the structured machine-readable representation of data conforms to a machine-readable language comprising semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
reasoning
Inference is made by answering a series of one or more queries to see if the inference step is valid.
Inference is performed by answering a series of one or more queries to generate the results required for the result of the inference.
A paragraph represents the details of a computing unit required to select and run that computing unit, namely: what the computing unit can do, how it is run, and how its results are interpreted.
The step of obtaining and executing one or more initial inferencing paragraphs returns other paragraphs with unknown terms that need to be processed, and the result of this processing is a junction tree that is used to give the results of the initial paragraphs.
The process of storing the junction tree and these other paragraphs with unknown terms occurs in parallel, allowing data acquisition and exploration of reasoning to be parallelized.
Once all paragraphs have been processed to a given maximum inference depth, a second, non-parallelized step is used to traverse the tree of processed paragraphs and unknown mappings to find valid answers.
Each paragraph in the paragraph list is processed to identify valid mappings from the paragraph memory store and the computing units, wherein a valid mapping for the paragraph list is one in which all unknowns have values and there are no contradictory mappings between paragraphs in the list.
The step of identifying valid mappings recursively traverses the data and finds all valid mappings that can be returned as answers to the initial question.
At least some of the paragraphs that have been generated from reasoning or computation are stored in a paragraph memory store so that they are available for faster processing in the future.
A history of these generated paragraphs is also stored so that changes in the level of trust in the paragraphs used to generate the paragraphs can be extended to give trust to these generated paragraphs.
A history of these generated paragraphs is also stored to enable removal of the generated paragraphs when the trusted state for one or more of them is changed.
When a new paragraph is added to the paragraph memory storage area, the new paragraph is assigned a low initial trust value when added by a normal user and a higher start value when added by a privileged user.
The question is expressed in the machine-readable language with: a paragraph including a node that identifies the paragraph as a question; language representing zero or more unknown entities requested within the semantics of the question; and language that represents the semantics of the question and references the zero or more unknown entities.
The question is expressed in the machine-readable language as a paragraph of the form (Question <unknowns>) (<passage>), where Question is a semantic node, <unknowns> is a list of zero or more semantic nodes representing unknown values (similar in meaning to letters used in algebra), and <passage> expresses what is being asked, using the unknown items.
Signals from applications of the system or method are stored in association with paragraphs used by the applications in order to keep track of values of the paragraphs.
A value vector is assigned to a paragraph, where the number at each index represents the different qualities of the paragraph.
Different qualities include authenticity, practicality and efficiency.
A process that uses paragraphs utilizes a priority vector, where the number at each index indicates the priority that the process places on the corresponding quality.
The total value of a paragraph for that process can then be obtained from the dot product of the two vectors.
The inference engine experiments with high-value paragraphs and lower-value paragraphs to answer questions, monitors the answers it provides for any signal indicating whether a lower-value paragraph has a positive or negative impact on the answer, and feeds this information back into the automated monitoring process, which re-evaluates the values of the paragraphs in light of the new signal.
The automated monitoring process automatically tests paragraphs to determine if they should be used for question answering.
Structured machine-readable representations of data previously stored in memory storage areas have been monitored in an automated manner.
The question is the result of translation: a natural-language question asked by the user is translated into a substantially semantically equivalent representation in the machine-readable language.
The response to the question is then translated into semantically equivalent natural language and presented to one or more users.
The question is the result of translating a question spoken by a user in natural language into a substantially semantically equivalent representation in the machine-readable language; a spoken answer is then played to the user, wherein the spoken answer is the result of translating the response to the question into natural language.
H. Learning
The UL model enables automatic learning. What is learned can be stored in UL and then used for reasoning, question answering, and the other uses and applications of UL described herein. The result of this learning contrasts with statistical machine learning, where, say, billions of weights in very large neural networks are adjusted: here the learned content is understood, can be explained to a human user, and can be reasoned over. We can generalize as follows:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Learning the new information and representing the new information in a structured machine-readable representation of data conforming to a machine-readable language;
(b) Storing a structured machine-readable representation of the data in a memory, and automatically processing the structured representation of the data for use in one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
A computer-based system configured to analyze data, the system configured to:
(a) Learning the new information and representing the new information in a structured machine-readable representation of data conforming to a machine-readable language;
(b) Storing a structured machine-readable representation of the data in a memory, and automatically processing the structured representation of the data for use in one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
learning the new information is obtained from automatically processing a structured machine-readable representation of the data to obtain or learn the new information, and the new information itself is represented as a structured machine-readable representation of the data stored in memory.
Learning new information is obtained from a machine learning system that generates classifications or predictions or other outputs that are represented in a structured machine-readable representation of data.
The machine learning system processes semantic nodes and paragraphs to obtain or learn new information.
New information is generated by automatically processing semantic nodes and paragraphs to answer questions.
A question is represented as one or more paragraphs, and a response to the question is automatically generated using one or more of the following steps: (i) Matching the question with a paragraph previously stored in a paragraph memory area; (ii) Acquiring and executing one or more computing units, wherein the computing units represent computing power associated with answering questions; (iii) One or more inference paragraphs are obtained and executed, which are paragraphs that represent the semantics of potentially applicable inference steps related to answering questions.
New information, represented as semantic nodes or paragraphs, is stored and used to improve the performance of learning new facts.
New information, represented as semantic nodes or paragraphs, is stored and used to improve the reasoning step.
New information represented as semantic nodes or paragraphs is stored and used to interpret or describe the new information in natural language.
New information represented as semantic nodes or paragraphs is stored and used for text or spoken dialog with a human user.
New information is learned from conversations with, or other natural language provided by, a human user, where the natural language provided in spoken or written form is translated into semantic nodes and paragraphs, and the new information represented by those semantic nodes and paragraphs is then stored and used.
Learning from reasoning, where semantic nodes and paragraphs generated from a series of reasoning steps are stored and utilized.
Learning from natural language, where all or part of a natural-language document source, such as a web page, scientific paper, or other article, is translated into semantic nodes and paragraphs, and the resulting semantic nodes or paragraphs are then used by an application.
Using a non-document source of natural language, such as an audio recording or video of human speech: a text transcription of the speech is first created using speech recognition techniques, and the transcription is then translated into semantic nodes or paragraphs.
Machine learning systems are used to analyze document and non-document data and create paragraphs from the data.
Neural networks are trained end-to-end to directly transform audio or video data into semantic nodes and paragraphs.
Natural language based learning combined with statistical machine learning to optimize translation of document and non-document data into semantic nodes and paragraphs.
The machine learning system is used to generate semantic nodes or paragraphs.
The machine learning system is a neural network system, such as a deep learning system.
Machine learning systems have been trained from training data comprising a natural language and a corresponding structured machine-readable representation, such as a machine-readable language comprising semantic nodes and paragraphs.
A segment of natural language is translated by a sequence-to-sequence neural architecture trained on training data comprising natural language and corresponding structured representations encoding its meaning.
The neural network system is a switch-transformer feed-forward neural network system.
The neural architecture includes an encoder and a decoder, and uses beam search to remove invalid semantic representations while decoding semantic representations from the decoder.
Structured data (such as the contents of tables found in documents or on the web, spreadsheets, or relational, graph, or other database contents) is transformed into semantic nodes and paragraphs by assigning semantic nodes to the identifiers in the structured data and writing semantic nodes and paragraphs that correspond to the meaning of the structured data.
Learning from analysis of other data where the data is algorithmically processed and the results of the processing are represented in terms of semantic nodes and paragraphs.
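A hypothetical sketch of the storage side of this concept, assuming Python: a paragraph learned from translation or reasoning is written to the paragraph store together with its provenance and an initial trust value, so that it can later be explained, re-valued, or removed if its sources lose trust (the trust values shown are illustrative only).

```python
# Hypothetical sketch: a learned paragraph is stored with provenance and an
# initial trust value, so it can later be explained, re-valued, or withdrawn.
from dataclasses import dataclass
from typing import List

@dataclass
class StoredParagraph:
    passage: tuple            # the learned paragraph, in nested-tuple form
    source_ids: List[str]     # the paragraphs or documents it was derived from
    trust: float              # illustrative trust value, not a value from the specification

paragraph_store: List[StoredParagraph] = []

def learn(passage: tuple, source_ids: List[str], privileged_user: bool = False) -> None:
    # Lower initial trust for ordinary users, higher for privileged users.
    initial_trust = 0.8 if privileged_user else 0.1
    paragraph_store.append(StoredParagraph(passage, source_ids, initial_trust))

learn(("LocatedIn", "Paris", "France"), source_ids=["doc-42"], privileged_user=True)
print(paragraph_store[0])
```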
I. Translation to and from UL
Natural language input is translated into UL: this enables the UL system to understand the natural-language input, because once it is translated into UL, the meaning of the original natural language is available to the machine. When large deep learning systems translate between natural languages, the ML community believes that after the original sentence is "encoded", the representation inside the neural network corresponds to some degree to the meaning of the language, as evidenced by the convincing translations produced in the target language. However, this internal encoding is unwieldy (it is a very large tensor, or set of weights) and cannot be used for anything other than generating a translation (which is again natural language and therefore not as useful to a machine). Acquiring the true meaning of a document is one of the significant frontiers that AI has not yet solved. Translation to and from UL also enables spoken or written human-machine interaction with examples of the invention.
More generally:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language;
(b) Receiving words or word sequences in natural language form;
(c) The word or word sequence is automatically translated into a machine-readable language by identifying or generating a structured machine-readable representation that semantically represents the meaning of the word or word sequence.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language;
(b) Receiving words or word sequences in natural language form;
(c) The word or word sequence is automatically translated into a machine-readable language by identifying or generating a structured machine-readable representation that semantically represents the meaning of the word or word sequence.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
the machine learning system is used to generate semantic nodes or paragraphs that represent words or word sequences in natural language form.
The machine learning system is a neural network system, such as a deep learning system.
Neural architecture is used to generate a machine-readable language.
The neural architecture utilizes recurrent neural networks or LSTMs or attention mechanisms or transformers.
Machine learning systems have been trained from training data comprising a natural language and a corresponding structured machine-readable representation, such as a machine-readable language containing semantic nodes and paragraphs.
A segment of natural language is translated by a sequence-to-sequence neural architecture trained on training data comprising natural language and corresponding structured representations encoding its meaning.
The neural network system is a switch-transformer feed-forward neural network system.
The neural architecture includes an encoder and a decoder, and uses beam search to remove invalid semantic representations while decoding semantic representations from the decoder.
The word or word sequence in natural language is a question and the question is answered with reference to the semantic representation.
The word or word sequence in natural language is one or more documents, and the semantic representation of the one or more documents is used to answer the question.
Inference referencing the semantic representation yields further new representations.
When automatically translating word sequences expressed in natural language into machine-readable language, the structure of the word sequences is compared to known machine-readable language structures in memory to identify similarity.
Automatic translation of a word or word sequence into a machine-readable language is achieved by referencing a memory area of correct translation between a previously recognized natural language and the machine-readable language.
Automatic translation of a word or word sequence into a machine-readable language is achieved by utilizing a functional pipeline that transforms the word or word sequence into a series of intermediate forms.
Automatically evaluating the semantic impact of the change on the word or word sequence in natural language form to determine if a known or substantially true example of a sufficiently accurate semantic node or paragraph can be used.
Semantic nodes or paragraphs representing words or word sequences provide machine readable representations of the meaning of the words or word sequences.
Semantic nodes or paragraphs representing words or word sequences are processed by a computer-based system for one or more of the following: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and regulating rules or guidelines; implementing one or more vertical applications.
Semantic nodes or paragraphs representing words or word sequences are processed by a computer-based system to generate human-readable output.
Human-readable output includes one or more of the following: answers to questions expressed in natural language; an inference statement explaining how the system concludes; a learning statement explaining what the system has learned; responses in human-machine interactions.
The system is further operable to automatically translate from the structured machine-readable representation to natural language.
When translating from a structured machine-readable representation to natural language, the system varies the generated translation among substantially semantically equivalent alternatives, creating varied and fresh responses for the benefit of the human user.
Automatic translation of a word or word sequence into a machine-readable language is accomplished by referencing the context of information related to generating the correct translation.
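A toy, hypothetical sketch of the pipeline-of-intermediate-forms approach mentioned above, assuming Python and an illustrative two-entry lexicon: a natural-language question is tokenised, mapped onto semantic nodes, and assembled into the question form used under concept G. A production embodiment would instead, or additionally, use the trained sequence-to-sequence models described above.

```python
# Toy, hypothetical sketch: translating a natural-language question into the
# machine-readable question form via a pipeline of intermediate forms.
LEXICON = {                      # illustrative lexicon: surface words -> semantic nodes
    "capital": "CapitalOf",
    "france": "France",
}

def tokenize(text: str):
    return [w.strip("?.,!").lower() for w in text.split()]

def to_ul_question(text: str):
    tokens = tokenize(text)                               # intermediate form 1: tokens
    nodes = [LEXICON[t] for t in tokens if t in LEXICON]  # intermediate form 2: known nodes
    relation, entity = nodes[0], nodes[1]                 # naive: first node is the relation
    return (("Question", "X"), (relation, entity, "X"))   # final form: (Question <unknowns>) (<passage>)

print(to_ul_question("What is the capital of France?"))
# -> (('Question', 'X'), ('CapitalOf', 'France', 'X'))
```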
J. Semantic node resolution
The UL model enables the quick and efficient creation of consistent semantic nodes and paragraphs: when a user wishes to use a shared public identifier for an entity, it sends a description of that entity to a service, which then returns the appropriate shared public identifier (if it exists and can be identified); if not, the user can use a new identifier. This enables the user to translate existing data into UL with great ease while making use of shared information, and then to use the representation for the purposes and applications described herein.
More generally:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language;
(b) A service is provided that is operable to receive a description of an entity and return one or more identifiers for a structured machine-readable representation of data corresponding to the entity such that a user can use the shared identifier for the entity.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language;
(b) A service is provided that is operable to receive a description of an entity and return one or more identifiers for a structured machine-readable representation of data corresponding to the entity such that a user can use the shared identifier for the entity.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
the description is described in part or in whole in a machine-readable language.
The description is written in part or in whole in one or more natural languages.
The service compares the description of the proposed semantic node or paragraph with the available information about the existing entity to determine if there is a match.
The service probabilistically determines whether there is a match.
The service additionally returns a match probability and one or more identifiers.
If no match is found, the service returns a new identifier.
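A hypothetical sketch of the resolution service, assuming Python: given a description, the service returns the shared identifier of the best-matching entity together with a probability-like score, or mints a new identifier when nothing matches. The registry contents, the Jaccard similarity measure, and the threshold are illustrative assumptions only.

```python
# Hypothetical sketch of a semantic-node resolution service: map an entity
# description onto an existing shared identifier, or mint a new one.
import uuid

SHARED_REGISTRY = {   # illustrative registry: shared identifier -> description keywords
    "6f1c1b2a-0000-4000-8000-000000000001": {"paris", "capital", "france", "city"},
    "6f1c1b2a-0000-4000-8000-000000000002": {"paris", "texas", "city", "usa"},
}

def resolve(description: str, threshold: float = 0.5):
    words = set(description.lower().split())
    best_id, best_score = None, 0.0
    for node_id, keywords in SHARED_REGISTRY.items():
        score = len(words & keywords) / len(words | keywords)   # Jaccard similarity
        if score > best_score:
            best_id, best_score = node_id, score
    if best_score >= threshold:
        return best_id, best_score            # shared identifier plus a probability-like score
    return str(uuid.uuid4()), best_score      # no match: the caller may adopt this new identifier

print(resolve("capital city paris france"))
```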
K. Translation between different natural languages
Since UL is intended to fully represent the meaning of the natural language it translates, including potentially subtle differences or characteristics such as formality, it is advantageous to use UL as a semantic intermediate language, independent of any natural language, prior to translation from it into the target language. This enables accurate, semantics-based translation and greatly reduces the number of translation systems or models required to translate between a large number of pairs of natural languages, since only one translation system or model is required per natural language.
More generally:
a computer-implemented method for translating between languages, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language;
(b) Receiving a word or word sequence in a first natural language to be translated into a second natural language;
(c) Automatically translating a word or word sequence expressed in a first natural language into a second natural language by: (i) Identifying a structured machine-readable representation of data representing semantics of a word or word sequence in a first natural language; and (ii) retrieving a word or word sequence in the second natural language that corresponds in meaning to the structured machine-readable representation of the identified data.
A computer-based system configured to translate between languages, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language;
(b) Receiving a word or word sequence in a first natural language to be translated into a second natural language;
(c) Automatically translating a word or word sequence expressed in a first natural language into a second natural language by: (i) Identifying a structured machine-readable representation of data representing semantics of a word or word sequence in a first natural language; and (ii) retrieving a word or word sequence in the second language that corresponds in meaning to the structured machine-readable representation of the identified data.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
A machine learning system is used to generate natural-language words or word sequences corresponding to given semantic nodes or paragraphs, or to generate semantic nodes or paragraphs from natural language.
The machine learning system is a neural network system, such as a deep learning system.
The machine learning system has been trained from training data comprising a natural language and a corresponding structured machine-readable representation, such as a machine-readable language containing semantic nodes and paragraphs.
A sequence-to-sequence neural architecture is trained on training data comprising segments of natural language and corresponding structured representations encoding their meaning.
The neural network system utilizes recurrent neural networks or LSTMs or attention mechanisms or transformers.
The neural network system is a switch transformer feed-forward neural network system.
The neural architecture includes an encoder and a decoder, and beam search is used to remove invalid semantic representations while decoding semantic representations from the decoder.
The semantic impact of changes to the word or word sequence in the first natural language is automatically evaluated to determine whether a known or ground-truth example of a sufficiently accurate semantic node or paragraph can be used.
The word or word sequence in the second language corresponding to the identified semantic node or paragraph is automatically varied to provide varied translations.
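By way of illustration only, the following Python sketch shows why a natural-language-independent intermediate representation reduces the number of translation systems needed: one encoder (natural language to UL) and one decoder (UL to natural language) per language suffice, rather than a direct system per language pair. The tiny phrase tables and the tuple form of the UL paragraphs are invented stand-ins for trained models.

# One encoder and one decoder per natural language: translating among N languages
# needs N model pairs, not N*(N-1) direct translation systems.
ENCODERS = {
    "en": {"the cat sleeps": ("sleep", "cat")},
    "fr": {"le chat dort": ("sleep", "cat")},
}
DECODERS = {
    "en": {("sleep", "cat"): "the cat sleeps"},
    "fr": {("sleep", "cat"): "le chat dort"},
}

def translate(text, source_lang, target_lang):
    ul_paragraph = ENCODERS[source_lang][text]   # step (i): identify the UL representation
    return DECODERS[target_lang][ul_paragraph]   # step (ii): render it in the target language

print(translate("the cat sleeps", "en", "fr"))   # -> "le chat dort"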
L. Voice assistant
UL enables always-on voice assistants that are able to discern meaning from input (e.g., spoken commands or questions) and generate semantically meaningful responses without the need for a "wake word".
We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language;
(b) Automatically and autonomously processing the detected audio or text into a structured representation of data for one or more of: deriving facts or relationships; reasoning; learning; translating; answering questions; processing natural language content; enabling human-computer interaction; representing and enforcing rules or guidelines; implementing one or more vertical applications.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language;
(b) Automatically and autonomously processing the detected audio or text into a structured representation of data for one or more of: deriving facts or relationships; reasoning; learning; translating; answering questions; processing natural language content; enabling human-computer interaction; representing and enforcing rules or guidelines; implementing one or more vertical applications.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
autonomous processing of audio or text occurs whenever audio or text is detected or received.
The system continues to receive input audio or text.
The system does not have a wake word or can operate in a mode without a wake word.
Autonomous processing of the detected audio or text occurs without any external trigger event to initiate the processing, such as a wake word or a user instruction or action that initiates the processing.
The detected audio or text is a question from the user, and the question is automatically processed and answers are automatically generated and provided to the user.
The detected audio or text is a statement from the user, and the statement is automatically processed and a response (such as a dialog response) is automatically generated and provided to the user.
The detected audio or text is a request from the user for an action to take place, and the request is automatically processed and the action is performed.
The detected audio or text is a request from the user for an action to take place, and if doing so would optimize or otherwise positively affect the achievement or implementation of the principle, statement or other rule, the request is automatically processed and the action is performed.
The detected audio or text is cryptographically isolated from the provider of the system, so the provider of the system cannot access the private information.
The detected audio or text is stored encrypted using an encryption method in which at least two different encryption keys are required to read the detected audio or text (a sketch of one such scheme follows after this list).
The detected audio or text comes from a device local to the user and a first of the at least two different encryption keys is associated with the device local to the user and a second of the at least two different encryption keys is maintained by the system provider.
The number of different encryption keys is at least three, and the third of the different encryption keys is maintained by an entity other than both the user and the system provider.
Providing multiple voice assistants, such as a unique one for each household.
The system is operable to deliver the experience of the plurality of different voice assistants to the plurality of users, including at least one data store containing personality information that determines personalities of at least some of the plurality of different voice assistants.
The personality information includes information about: the gender or name or voice of the voice assistant, or its mood or emotional responses or formality, or its location on an introvert-extrovert scale, or its location on any of the Myers-Briggs scales, or its classification under the Myers-Briggs types or another personality test, or its visual appearance.
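By way of illustration only, the following Python sketch shows one way the two-key storage scheme mentioned above could work, using the cryptography package's Fernet primitive as a stand-in for whatever ciphers and key-management arrangements a real deployment would use; a third key held by another entity could be added as a further layer in the same way.

from cryptography.fernet import Fernet

device_key = Fernet.generate_key()     # held by the user's local device
provider_key = Fernet.generate_key()   # held by the system provider

def store_utterance(audio_bytes):
    """Encrypt in two layers; reading the stored blob requires BOTH keys."""
    inner = Fernet(device_key).encrypt(audio_bytes)
    return Fernet(provider_key).encrypt(inner)

def read_utterance(blob):
    inner = Fernet(provider_key).decrypt(blob)
    return Fernet(device_key).decrypt(inner)

blob = store_utterance(b"turn the lights off")
assert read_utterance(blob) == b"turn the lights off"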
M. Principles
UL enables goals (e.g., maximizing customer happiness, not doing anything illegal) to be captured in a machine-understandable form: these are "principles", and they enable the system to determine actions, and to decide whether to perform candidate actions, by determining whether doing so would optimize or violate the principles. This gives a machine the ability to act in an ethical or moral manner and to determine its own behaviour, rather than having everything it does be determined by pre-written computer code. This enables the capabilities of the system to be extended as the system learns, without the need to add and debug new program code. It also enables consistent changes or variations to the behaviour of the system or its product rules to be made very quickly, without any code changes.
We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language;
wherein the structured representation of data includes one or more principles, statements, or other rules defining goals or motivations, which are themselves represented using the structured representation of data;
(b) Analyzing the potential actions to determine if executing the actions will optimize or otherwise affect achievement or implementation of those principles, statements, or other rules;
(c) These actions are automatically selected, decided or performed only if the actions optimize or otherwise positively impact the achievement or implementation of those principles, statements or other rules.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language;
wherein the structured representation of data includes one or more principles, statements, or other rules defining goals or motivations, which are themselves represented using the structured representation of data;
(b) Analyzing the potential actions to determine if executing the actions will optimize or otherwise affect achievement or implementation of those principles, statements, or other rules;
(c) These actions are automatically selected, decided or performed only if the actions optimize or otherwise positively impact the achievement or implementation of those principles, statements or other rules.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
Actions conforming to the principles, statements, or other rules ("principles") are automatically proposed by referencing the principles.
Actions include communicating with the user in written form.
The action includes communicating with the user in verbal form.
The principle includes at least one metric, such as user happiness, that the system should attempt to maximize.
The principle includes at least one metric that the system should attempt to minimize, such as user unhappiness.
The principle includes at least one rule specifying actions that the system must not do.
The system is further operable to avoid doing actions that the system must not do, by referencing the principle.
The principle includes at least one suggestion of what action to take under defined conditions.
The principles include sub-principles, which are principles related to other principles or are more specific examples of another principle.
Actions include accessing other remote computer systems.
The actions include changing the state of a device linked to the system via a network.
The action includes initiating a verbal interaction with a human.
The data store contains a machine-readable representation of the world encoding meaning, and wherein the system is further operable to reason with reference to the machine-readable representation of the world to select actions that comply with the principles.
The machine-readable representation of the world includes a representation of valid reasoning steps, and wherein the system is further operable to reason using the representation of valid reasoning steps.
The machine-readable representation of the world comprises a representation of computing power available to the system, and wherein the system is further operable to utilize the computing power by referencing the machine-readable representation.
The system is operable to learn and augment a machine-readable representation of the world.
The system is operable to learn from communications with at least one user.
The system is operable to learn from at least one external sensor connected to the system via a network.
The machine-readable principles are represented at least in part by a combination of identifiers, and wherein at least some of the identifiers represent concepts corresponding to real world things.
The system is further operable to receive a description of the concept from the remote system and use the description to return an identifier that may represent the concept.
The system is operable to reason continuously in a manner that results in actions consistent with the principles.
The system is operable to answer questions from a human user regarding the principle.
The computer system includes a long-term memory; a short-term memory; and a principles store containing machine-readable principles representing the principles guiding the system; and wherein the computer system is operable to receive events and to use the events, the contents of the long-term memory, the contents of the short-term memory, and the principles to perform actions that comply with the principles.
The computer system includes: means for generating a candidate action; and means for determining, by reference to the principles, whether to execute the candidate action, and for executing the action.
Answering a question from a human user comprises two actions: generating a response to the question and transmitting the response to the human user.
The event comprises a transmission from at least one user, and wherein the action comprises a transmission to at least one user.
The system is further operable to learn and store the content that the system has learned to long term memory.
The computer system is not operable to change the principle.
The principle includes a principle that prohibits actions that may lead to changing the principle.
The system is further operable to perform an independent check for each potential action against the principle and to discard the potential action if the independent check finds that the potential action is not compatible with the principle.
The computer system is further operable to actively exclude knowledge about itself for determining the action.
Sinkage engine
The potential actions are autonomously generated by the computer-based system.
The potential actions are autonomously generated by the computer-based system as the output of processing inputs, such as audio or text.
The potential actions are generated autonomously with a process that operates substantially continuously.
The potential actions are generated autonomously without any external trigger event for initiating the process or user instruction or action for initiating the process.
If the potential action optimizes or otherwise positively affects the achievement or implementation of those principles, statements, or other rules, then the potential action is automatically performed.
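By way of illustration only, the following Python sketch shows one possible shape of the principle-checking step described in this section: candidate actions are scored against machine-readable principles and only actions whose net effect on the principles is positive are performed. The principle functions and the example actions are invented for the example.

# Each principle scores a candidate action: positive values further the principle,
# strongly negative values mark a violation (a hard prohibition).
PRINCIPLES = [
    lambda action: 1.0 if action.get("increases_user_happiness") else 0.0,
    lambda action: -100.0 if action.get("illegal") else 0.0,
]

def evaluate(action):
    return sum(principle(action) for principle in PRINCIPLES)

def act(candidate_actions):
    """Perform only candidates whose net effect on the principles is positive."""
    performed = []
    for action in candidate_actions:
        if evaluate(action) > 0:        # independent check against the principles
            performed.append(action["name"])
    return performed

candidates = [
    {"name": "answer the user's question", "increases_user_happiness": True},
    {"name": "share private data without consent",
     "increases_user_happiness": True, "illegal": True},
]
print(act(candidates))   # -> ["answer the user's question"]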
Use case
N1: human-machine interface
UL may be used as part of a human-machine interface in which a machine can semantically interpret input such as verbal, written, or GUI instructions provided by a human, and thus enable an improved user experience.
We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured representation of data includes a representation of verbal, written, or GUI instructions provided by a human to a human-machine interface;
(b) Automatically processing the structured representation of the data for one or more of the following: deriving facts or relationships; reasoning; learning; translating; answering questions; processing natural language content; enabling human-computer interaction; representing and enforcing rules or guidelines; implementing one or more vertical applications.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of verbal, written, or GUI instructions provided by a human to a human-machine interface;
(b) Automatically processing the structured representation of the data for one or more of the following: deriving facts or relationships; reasoning; learning; translating; answering questions; processing natural language content; enabling human-computer interaction; representing and enforcing rules or guidelines; implementing one or more vertical applications.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
N2: searching and analyzing documents or web pages
Websites and web documents (in the limit, the entire WWW) can be automatically translated into UL and thus given deep, machine-understandable semantic meaning; this allows web documents to be used in a far more powerful manner, including reasoning about and integrating the meaning of those documents in ways that were previously impossible.
We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data comprises a representation of at least a portion of a document stored in a document storage area;
(b) Automatically processing the structured representation of the data for one or more of the following: deriving facts or relationships; reasoning; learning; translating; answering questions; processing natural language content; enabling human-computer interaction; representing and enforcing rules or guidelines; implementing one or more vertical applications that require searching or analyzing documents.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data comprises a representation of at least a portion of a document stored in a document storage area;
(b) Automatically processing the structured representation of the data for one or more of the following: deriving facts or relationships; reasoning; learning; translating; answering questions; processing natural language content; enabling human-computer interaction; representing and enforcing rules or guidelines; implementing one or more vertical applications that require searching or analyzing documents.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
the part of the document has been automatically translated into a machine-readable language.
The machine learning system is used to generate semantic nodes or paragraphs that represent words in a document.
The machine learning system is a neural network system, such as a deep learning system.
A neural architecture is used to generate the machine-readable language.
The neural architecture utilizes recurrent neural networks or LSTMs or attention mechanisms or transformers.
The machine learning system has been trained from training data comprising a natural language and a corresponding structured machine-readable representation, such as a machine-readable language comprising semantic nodes and paragraphs.
A sequence-to-sequence neural architecture is trained on training data comprising segments of natural language and corresponding structured representations encoding their meaning.
The neural network system utilizes recurrent neural networks or LSTMs or attention mechanisms or transformers.
The neural network system is a switch transformer feed-forward neural network system.
The neural architecture includes an encoder and a decoder, and beam search is used to remove invalid semantic representations while decoding semantic representations from the decoder.
When automatically translating word sequences expressed in natural language into machine-readable language, the structure of the word sequences is compared to known machine-readable language structures in memory to identify similarity.
Automatic translation of a word or word sequence into a machine-readable language is achieved by referencing a memory area of correct translation between a previously recognized natural language and the machine-readable language.
Automatic translation of a word or word sequence into a machine-readable language is achieved by utilizing a functional pipeline that transforms the word or word sequence into a series of intermediate forms.
The user's query has been automatically translated into UL and the system responds to the user request by utilizing the translated UL.
The method implements a web search system, and the document storage area includes pages from the World Wide Web that are indexed and then at least partially translated into UL.
Translation includes converting the natural language components of these pages into UL or converting tables or other structured data into UL.
Answers to the queries include links to web pages containing the information being searched or providing the service being searched, or the system provides information directly in the form of text or spoken answers.
The direct response is accompanied by a link to the source of the information and includes associated data, such as images or tables.
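By way of illustration only, the following Python sketch shows how a query expressed as semantic nodes might be answered against an index of UL-translated web documents, returning links to the source pages; the URLs, node identifiers, and the set-based representation of paragraphs are invented for the example.

# Each indexed entry pairs a source URL with the semantic nodes of a translated paragraph.
INDEX = [
    {"url": "https://example.org/eiffel",
     "paragraph": frozenset({"eiffel_tower", "located_in", "paris"})},
    {"url": "https://example.org/colosseum",
     "paragraph": frozenset({"colosseum", "located_in", "rome"})},
]

def answer(query_paragraph):
    """Return source links for indexed paragraphs that contain the queried meaning."""
    return [entry["url"] for entry in INDEX
            if query_paragraph <= entry["paragraph"]]

print(answer(frozenset({"eiffel_tower", "located_in"})))   # -> ['https://example.org/eiffel']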
N3. Map data represented as UL, and associated systems utilizing map data and location-based search
Map and location-based data can be represented as UL and thus given machine-understandable semantic meaning; this allows map and location-based data to be used in a much more powerful manner. We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of map or location-based data;
(b) Automatically processing the structured representation of the data for one or more of the following: deriving facts or relationships; reasoning; learning; translating; answering questions; processing natural language content; enabling human-computer interaction; representing and enforcing rules or guidelines; implementing one or more vertical applications that use location-based or map data.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of map or location-based data;
(b) Automatically processing the structured representation of the data for one or more of the following: deriving facts or relationships; reasoning; learning; translating; answering questions; processing natural language content; enabling human-computer interaction; representing and enforcing rules or guidelines; implementing one or more vertical applications that use location-based or map data.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
map data or location-based data has been automatically translated into a machine-readable language.
The machine learning system is used to generate semantic nodes or paragraphs that represent map data or location-based data.
The machine learning system is a neural network system, such as a deep learning system.
A neural architecture is used to generate the machine-readable language.
The neural architecture utilizes recurrent neural networks or LSTMs or attention mechanisms or transformers.
The machine learning system has been trained from training data comprising a natural language and a corresponding structured machine-readable representation, such as a machine-readable language comprising semantic nodes and paragraphs.
A sequence-to-sequence neural architecture is trained on training data comprising segments of natural language and corresponding structured representations encoding their meaning.
The neural network system utilizes recurrent neural networks or LSTMs or attention mechanisms or transformers.
The neural network system is a switch transformer feed-forward neural network system.
The neural architecture includes an encoder and a decoder, and beam search is used to remove invalid semantic representations while decoding semantic representations from the decoder.
When automatically translating word sequences expressed in natural language into machine-readable language, the structure of the word sequences is compared to known machine-readable language structures in memory to identify similarity.
Automatic translation of a word or word sequence into a machine-readable language is achieved by referencing a memory area of correct translation between a previously recognized natural language and the machine-readable language.
Automatic translation of a word or word sequence into a machine-readable language is achieved by utilizing a functional pipeline that transforms the word or word sequence into a series of intermediate forms.
N4. identifying relevant advertisements and news
UL enables advertisements, news articles, or other information items (e.g., on WWW) to be translated into UL, and their semantic meaning can be used for machine processing: this enables automatic assessment of relevance to a particular individual and thus enables personalized advertising and the like.
We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language representing meaning; the structured machine-readable representation of data includes a representation of at least a portion of one or more advertisements, news articles, or other information items;
(b) The structured representation of the data is automatically processed to identify advertisements, news articles, or other items of information that are relevant to a particular individual.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; a structured machine-readable representation of data includes representations of at least portions of one or more advertisements, news articles, or other information items;
(b) The structured representation of the data is automatically processed to identify advertisements, news articles, or other items of information that are relevant to a particular individual.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
the method also determines advertisements, news articles, or other information items relevant to the user by analyzing semantic nodes that are representations of the user-specific data.
The user-specific data includes one or more of the following related to the user: social media profile, posts, profile information, "likes", web search or web browsing history, and natural language dialogue/exchanges between the user and the system in which the system stores and remembers information that the user has given about himself or herself.
Advertisements, news articles or other information items related to a particular individual, as well as user-specific data, have been automatically translated into a machine-readable language.
Machine learning systems are used to generate semantic nodes or paragraphs that represent advertisements, news articles, or other information items related to a particular individual, as well as user-specific data.
The machine learning system is a neural network system, such as a deep learning system.
Machine learning systems have been trained from training data comprising a natural language and a corresponding structured machine-readable representation, such as a machine-readable language comprising semantic nodes and paragraphs.
A sequence-to-sequence neural architecture is trained on training data comprising segments of natural language and corresponding structured representations encoding their meaning.
The neural network system is a switch transformer feed-forward neural network system.
The neural architecture includes an encoder and a decoder, and beam search is used to remove invalid semantic representations while decoding semantic representations from the decoder.
When automatically translating word sequences expressed in natural language into machine-readable language, the structure of the word sequences is compared to known machine-readable language structures in memory to identify similarity.
Automatic translation of a word or word sequence into a machine-readable language is achieved by referencing a memory area of correct translation between a previously recognized natural language and the machine-readable language.
Automatic translation of a word or word sequence into a machine-readable language is achieved by utilizing a functional pipeline that transforms the word or word sequence into a series of intermediate forms.
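By way of illustration only, the following Python sketch shows one way relevance to a particular individual could be scored once both the information items and the user-specific data have been translated into semantic nodes; the profile and item contents are invented for the example.

# Both the user profile and each item are sets of semantic-node identifiers.
USER_PROFILE = {"running", "vegetarian_diet", "lives_in_london"}

ITEMS = {
    "trail-running shoe advert": {"running", "footwear"},
    "steakhouse opening news":   {"steak", "restaurant", "lives_in_london"},
}

def relevance(item_nodes, profile_nodes):
    """Fraction of the item's semantic nodes that also appear in the user's profile."""
    return len(item_nodes & profile_nodes) / len(item_nodes)

ranked = sorted(ITEMS, key=lambda name: relevance(ITEMS[name], USER_PROFILE), reverse=True)
print(ranked[0])   # -> "trail-running shoe advert"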
N5. Aggregation and summarization of news
UL enables news from multiple sources (e.g., on the WWW) to be translated partially or fully into UL, so that its semantic meaning is available for machine processing and summarization: this enables automatic personalization of news summaries and the like. We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes representations of news from multiple sources;
(b) Automatically processing the structured representation of the data for one or more of the following: generating a summary of news from a plurality of sources; using the news summary to derive facts or relationships; reasoning; learning; translating; answering questions; processing natural language content; enabling human-computer interaction; representing and enforcing rules or guidelines; implementing one or more vertical applications that use summaries of news from multiple sources.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes representations of news from multiple sources;
(b) Automatically processing the structured representation of the data for one or more of the following: generating a summary of news from a plurality of sources; using the news summary to derive facts or relationships; reasoning; learning; translating; answering questions; processing natural language content; enabling human-computer interaction; representing and enforcing rules or guidelines; implementing one or more vertical applications that use summaries of news from multiple sources.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
News articles related to a particular user are also determined by analyzing semantic nodes that are representations of user-specific data.
The user-specific data includes one or more of the following related to the user: social media profile, posts, profile information, "likes", web search or web browsing history, and natural language dialogue/exchanges between the user and the system in which the system stores and remembers information that the user has given about himself or herself.
News articles have been automatically translated into machine-readable language.
The machine learning system is used to generate semantic nodes or paragraphs that represent news articles.
The machine learning system is a neural network system, such as a deep learning system.
A neural architecture is used to generate the machine-readable language.
The neural architecture utilizes recurrent neural networks or LSTMs or attention mechanisms or transformers.
The machine learning system has been trained from training data comprising a natural language and a corresponding structured machine-readable representation, such as a machine-readable language comprising semantic nodes and paragraphs.
A sequence-to-sequence neural architecture is trained on training data comprising segments of natural language and corresponding structured representations encoding their meaning.
The neural network system utilizes recurrent neural networks or LSTMs or attention mechanisms or transformers.
The neural network system is a switch transformer feed-forward neural network system.
The neural architecture includes an encoder and a decoder, and beam search is used to remove invalid semantic representations while decoding semantic representations from the decoder.
When automatically translating word sequences expressed in natural language into machine-readable language, the structure of the word sequences is compared to known machine-readable language structures in memory to identify similarity.
Automatic translation of a word or word sequence into a machine-readable language is achieved by referencing a memory area of correct translation between a previously recognized natural language and the machine-readable language.
Automatic translation of a word or word sequence into a machine-readable language is achieved by utilizing a functional pipeline that transforms the word or word sequence into a series of intermediate forms.
N6. matching between people using UL
UL enables accurate and scalable matching searches: for example, gender, age, and other information relevant to dating, or to marital or friendship matching, or to business contact matching, may be translated into UL, and its semantic meaning used for machine processing: this results in improved automatic personalized matching. We can generalize to:
A computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of personal information defining one or more of the following attributes of a person: gender, age, information related to appointments or pairing, information related to identifying business contacts, information related to identifying friends;
(b) The structured representation of the data is automatically processed to provide compatibility matches between persons.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of personal information defining one or more of the following attributes of a person: gender, age, information related to appointments or pairing, information related to identifying business contacts, information related to identifying friends;
(b) The structured representation of the data is automatically processed to provide compatibility matches between persons.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
And wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
personal information includes information from conversations in natural language form with the system, where the user's response is translated into a structured machine-readable representation of the data.
Personal information includes information from the output of the machine learning model.
Personal information includes information from reasoning.
Personal information includes information from learning.
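By way of illustration only, the following Python sketch shows one way compatibility matching could be computed once personal information has been resolved to semantic nodes; the names, attributes, and preferences are invented for the example.

# Each person has resolved attributes and preferences, both as semantic-node sets.
PEOPLE = {
    "alex": {"attributes": {"age_30s", "hiking", "non_smoker"},
             "preferences": {"hiking", "non_smoker"}},
    "sam":  {"attributes": {"age_30s", "hiking", "non_smoker"},
             "preferences": {"hiking"}},
    "jo":   {"attributes": {"age_50s", "smoker"},
             "preferences": {"opera"}},
}

def compatibility(a, b):
    """Symmetric score: how well each person's attributes satisfy the other's preferences."""
    forward = len(PEOPLE[a]["preferences"] & PEOPLE[b]["attributes"])
    backward = len(PEOPLE[b]["preferences"] & PEOPLE[a]["attributes"])
    return forward + backward

best = max(((a, b) for a in PEOPLE for b in PEOPLE if a < b),
           key=lambda pair: compatibility(*pair))
print(best)   # -> ('alex', 'sam')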
N7. Identifying abusive or untrue posts in social media
UL enables social media posts to be partially or fully translated into UL, so that their semantic meaning is available for machine processing: this enables automatic, high-accuracy analysis of compliance with requirements against abusive, untrue, or illegal posts. We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of a social media post;
(b) The structured representation of the data is automatically processed to determine whether the post complies with requirements for protection from abuse or illegal posts.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of a social media post;
(b) The structured representation of the data is automatically processed to determine whether the post complies with requirements for protection from abuse or illegal posts.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
processing includes determining whether the social media post is authentic.
Processing includes determining whether the social media post is illegal.
The machine-readable representation of the data further includes at least a partial representation of the requirement to prevent abuse or illegal posts, and processing the representation of the reference requirement.
The processing additionally generates a natural language explanation of why the social media post is not compliant.
The processing additionally applies a statistical machine learning model to the social media post and uses the results of the model.
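By way of illustration only, the following Python sketch shows one way a post translated into semantic nodes could be checked against machine-readable requirements, with a natural-language explanation of any failure; the two requirements shown are invented for the example.

# Each requirement maps a name to a predicate over the post's semantic nodes.
REQUIREMENTS = {
    "no abuse": lambda nodes: "insult" not in nodes,
    "no fraud": lambda nodes: not ({"guaranteed_return", "investment"} <= nodes),
}

def check_post(post_nodes):
    """Return (compliant, natural-language explanation of any failures)."""
    failures = [name for name, ok in REQUIREMENTS.items() if not ok(post_nodes)]
    if failures:
        return False, "Post violates: " + ", ".join(failures)
    return True, "Post complies with all checked requirements"

print(check_post({"investment", "guaranteed_return", "crypto"}))
# -> (False, 'Post violates: no fraud')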
N8. Analysis of customer reviews
UL enables customer reviews (e.g., of products or companies) to be translated into UL, so that their semantic meaning is available for machine processing: this enables automatic analysis of customer reviews. We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; a structured machine-readable representation of data comprising a representation of customer reviews of a product or service;
(b) The structured representation of the data is automatically processed to analyze customer reviews.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of customer reviews;
(b) The structured representation of the data is automatically processed to analyze customer reviews.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
the system is further configured to automatically answer questions about the product or service via the structured machine-readable representation of the reference data.
The system is further configured to automatically answer general product questions from the customer by referencing the structured machine-readable representation.
The system is further configured to translate some or all of the natural language in the customer reviews into a structured machine-readable representation of the data.
N9. shopping queries and product requests
UL enables product descriptions, user product requests, previous searches by users, social media or shopping histories to be translated into UL, and their semantic meaning can be used for machine processing: this enables automatic analysis of which products best match the user's product request or the user's previous search, social media, or shopping history. We can generalize to:
A computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes representations of product descriptions, user product requests, previous searches by the user, social media, or shopping history;
(b) The structured representation of the data is automatically processed to determine which products best match the user's product request or the user's previous search, social media, or shopping history.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes representations of product descriptions, user product requests, previous searches by the user, social media, or shopping history;
(b) The structured representation of the data is automatically processed to determine which products best match the user's product request or the user's previous search, social media, or shopping history.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
And wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
The products that best match are presented to the user for possible purchase.
The automatic processing of the structured representation of the data occurs as part of a natural language dialogue with the user regarding what the user wishes to purchase.
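By way of illustration only, the following Python sketch shows one way products could be scored against a user's explicit request and shopping history once all three are expressed as semantic nodes; the catalogue, request, and history are invented for the example.

CATALOGUE = {
    "waterproof hiking boots": {"footwear", "waterproof", "hiking"},
    "running shorts":          {"clothing", "running"},
}

def score(product_nodes, request_nodes, history_nodes):
    # Nodes matching the explicit request count double; history nodes break ties.
    return 2 * len(product_nodes & request_nodes) + len(product_nodes & history_nodes)

request = {"waterproof", "footwear"}          # from the user's product request
history = {"hiking", "camping_stove"}         # from previous searches or purchases

best = max(CATALOGUE, key=lambda p: score(CATALOGUE[p], request, history))
print(best)   # -> "waterproof hiking boots"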
N.10 job matching
UL enables job descriptions and job seekers' skills and experience to be translated into UL, so that their semantic meaning can be used for machine processing: this enables automatic analysis of which jobs best match a job seeker's skills and experience, across a very wide variety of skills, jobs and contexts, without the need for additional computer code; these matches are highly accurate and can be explained in natural language. We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a description of the job and a representation of skill and experience of the job seeker;
(b) The structured representation of the data is automatically processed to determine which jobs best match the skill and experience of the job applicant.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a description of the job and a representation of skill and experience of the job seeker;
(b) The structured representation of the data is automatically processed to determine which jobs best match the skill and experience of the job applicant.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
A system operable to match candidates with vacant roles, comprising at least one data store that includes: a plurality of candidate profiles, wherein at least some portions of at least some of the candidate profiles are in a structured machine-readable form encoding meaning; and a plurality of job specifications for the vacant roles, wherein at least some portions of at least some of the job specifications are stored in a structured machine-readable form encoding meaning; and wherein the system is further operable to match the plurality of candidate profiles with the plurality of job specifications to identify high-confidence matches between candidates and vacant roles.
The structured machine-readable form is a language that represents meaning by creating a combination of identifiers, and wherein at least some of the identifiers represent human skills and experience.
The at least one data store further stores a representation of the candidate's desired role, at least in part in a structured machine-readable form, and wherein the system is further operable to match vacant roles against the representation of the candidate's desired role to improve the match between candidates and vacant roles.
The system is further operable to send a push notification to the mobile device when a high confidence match is found.
The system is further operable to explain how the candidate matches the role by generating an explanation of which parts of the work specification match the candidate's skills and experience.
The explanation is provided in natural language.
The system is operable to match requirements in the job specification with the skills and experience of the candidate, wherein there are no common keywords between the candidate resume and relevant parts of the natural language version of the job specification.
The system is operable to perform a series of logical reasoning steps in order to match the skill or experience of the candidate with the requirements in the work specification.
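By way of illustration only, the following Python sketch shows how a single inference step over a small skill hierarchy can match a requirement even when the requirement's keyword never appears in the candidate's stated skills, and how the match can be explained; the skill hierarchy and profiles are invented for the example.

# A skill implies a broader skill (e.g. knowing Python implies programming ability).
IMPLIES = {"python": "programming", "keras": "machine_learning"}

def expand(skills):
    """Add skills that follow from the candidate's stated skills."""
    expanded = set(skills)
    for skill in skills:
        if skill in IMPLIES:
            expanded.add(IMPLIES[skill])
    return expanded

def match(candidate_skills, job_requirements):
    """Return (matched requirements, unmatched requirements) for an explanation."""
    skills = expand(candidate_skills)
    return job_requirements & skills, job_requirements - skills

matched, missing = match({"python", "sql"}, {"programming", "sql", "public_speaking"})
print("matched:", matched)   # 'programming' is matched although it never appears in the CV
print("missing:", missing)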
N.11 horizontal health application
UL supports the creation of a horizontal health application that is capable of integrating an extremely wide variety of heterogeneous health data. We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of personal health or medical data;
(b) The structured representation is automatically processed to analyze the personal health or medical data.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of personal health or medical data;
(b) The structured representation is automatically processed to analyze the personal health or medical data.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Optional features:
A broad set of health data for one or more persons is managed, and wherein at least some of the health data is represented in a structured machine-readable form encoding meaning, stored in one or more data stores.
The health data comprises nutritional data about food or beverage that has been consumed by at least one of the one or more people.
Nutritional data includes data representing uncertainty about the amount of consumed items or nutritional information or ingredients.
Health data includes data about: the results of blood tests or measurements or body composition or activity information or genetic data or microbiome data or bowel movement events or sleep data or exercise data or activity data or disease symptoms or human moods or menses or drug intake or medical conditions or data from any wearable device.
Implementing a dialogue with one or more users via text.
Enabling selected others to talk to one or more users and view relevant health data.
Create a graph of specific types of health data where the user can see how the different data are related.
Analyze the health data to reveal insights related to the health of a particular user.
The insights include potential dietary intolerances or behaviors affecting sleep.
The elements of the health data are combined to calculate additional health data items that are not already present in the health data.
The additional health data item is an estimate of the caffeine present in the user's body at a particular time.
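By way of illustration only, the following Python sketch shows how such an additional health-data item could be derived from recorded intake events; the 5-hour half-life and the example events are invented for the example and are not a medical claim.

HALF_LIFE_HOURS = 5.0   # assumed caffeine half-life for illustration only

def caffeine_in_body(intake_events, at_hour):
    """Estimate caffeine (mg) remaining at `at_hour`, given (hour, mg) intake events."""
    total = 0.0
    for hour, mg in intake_events:
        if hour <= at_hour:
            total += mg * 0.5 ** ((at_hour - hour) / HALF_LIFE_HOURS)
    return total

events = [(8.0, 95.0), (13.0, 95.0)]          # two coffees, at 8am and 1pm
print(round(caffeine_in_body(events, 18.0)))  # estimated mg remaining at 6pm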
N.12 Accounting
UL enables financial or accounting information to be translated into UL, so that its semantic meaning can be used for machine processing: this enables automatic analysis of financial or accounting information. We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of financial or accounting information;
(b) The structured representation is automatically processed to analyze the financial or accounting information.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes a representation of financial or accounting information;
(b) The structured representation is automatically processed to analyze financial or accounting information.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
processing accounting data of at least one business, and at least some of the accounting data being represented in a structured machine-readable format encoding real world meanings stored within one or more data stores.
The structured machine-readable format comprises a combination of identifiers, wherein at least some of the identifiers represent real world entities related to the activity of the at least one enterprise, and wherein further meaning is encoded according to a selection of the combination of identifiers.
Automatically presenting accounting data in a plurality of different accounting standards.
Generating answers to questions regarding the activities of the at least one enterprise.
N.13 Voice assistant / chatbot
Natural language input to a voice assistant or chatbot can be translated into UL, and the UL representation used internally to answer questions, hold dialogues, or take actions. This horizontal representation makes it easier to extend the capabilities of the voice assistant or chatbot, and makes it easier for the system to work with a large number of other natural languages, since only the translation step needs to be changed. We can generalize to:
a computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes: a representation of user speech or text input for a human-machine interface;
(b) The structured representation is automatically processed to analyze user speech or text input.
A computer-based system configured to analyze data, the system configured to:
(a) Storing in memory a structured machine-readable representation of data conforming to a machine-readable language; the structured machine-readable representation of data includes: a representation of user speech or text input for a human-machine interface;
(b) The structured representation is automatically processed to analyze user speech or text input for the human-machine interface.
In a preferred embodiment, a structured machine-readable representation of data conforming to a machine-readable language includes semantic nodes and paragraphs;
and wherein the semantic node represents an entity and itself is represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Other optional features:
privacy preserving mode
A first wake word initiates the process, after which the system enters a privacy-preserving state that requires a second wake word, and wherein the second wake word is long enough or unusual enough that it is much less likely to be misrecognized than the first wake word.
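By way of illustration only, the following minimal Python sketch shows how such a two-wake-word privacy-preserving mode might be structured. The wake words, class name and example utterances are assumptions of this sketch, not part of the specification.

# Minimal sketch of a two-wake-word privacy-preserving mode (illustrative only).
# The wake words, state names and example utterances are assumptions of this sketch.

PRIMARY_WAKE_WORD = "assistant"                 # short and convenient, but easier to misrecognize
PRIVACY_WAKE_WORD = "purple elephant sunrise"   # long/unusual, far less likely to be misheard


class WakeWordGate:
    """Tracks whether the assistant may listen, given the current privacy state."""

    def __init__(self):
        self.privacy_mode = False  # once True, only the long wake word re-activates listening

    def enter_privacy_mode(self):
        self.privacy_mode = True

    def should_activate(self, utterance: str) -> bool:
        """Return True if this utterance should wake the assistant."""
        text = utterance.lower().strip()
        if self.privacy_mode:
            # In privacy mode the short wake word is ignored; only the longer,
            # harder-to-misrecognize phrase activates the assistant.
            return PRIVACY_WAKE_WORD in text
        return PRIMARY_WAKE_WORD in text


if __name__ == "__main__":
    gate = WakeWordGate()
    print(gate.should_activate("assistant, what's the weather?"))   # True
    gate.enter_privacy_mode()
    print(gate.should_activate("assistant, what's the weather?"))   # False
    print(gate.should_activate("purple elephant sunrise, resume"))  # True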
Multiple different voice assistants
Delivering the experience of the plurality of different voice assistants to the plurality of users and at least one data store containing personality information that determines personalities of at least some of the plurality of different voice assistants.
The personality information includes information about: the gender or name or voice or mood or emotional responses or formality of the voice assistant, or its position on an extrovert/introvert scale, or its position on any of the Myers-Briggs scales, or its classification into a Myers-Briggs type or other personality test, or its visual appearance.
There is at least one set of machine-readable principles that represents the purposes and rules guiding at least some of the plurality of voice assistants, and actions are then taken in accordance with those principles by referencing them.
At least one set of machine-readable principles is a plurality of sets of machine-readable principles, and wherein a selected one of a plurality of different voice assistants is mapped to a selected one of the plurality of sets of machine-readable principles, wherein the different voice assistants are driven by different principles.
Private user data is only accessible by selected ones of the plurality of different voice assistants.
Device type
The computer-based system is configured to be a voice assistant.
The computer-based system is a voice assistant device configured to control items in a home, automobile, or other environment using user speech or text input.
The computer-based system is a voice assistant device configured to run on a smart phone, notebook, smart speaker, or other electronic device.
The computer-based system is a voice assistant device configured to run at least partially on a cloud or central server and at least partially on an edge device.
For each of the use case concepts N1-N13, the following applies:
methods and systems use a single grammar term (such as brackets) to disambiguate combinations of nodes, as defined in concept a.
Methods and systems use shared syntax across fact statements, queries, and reasoning, as defined in concept B.
Methods and systems use nesting of nodes as defined in concept C.
The method and system use ID selection from an address space that is large enough to enable the user to select a new identifier while the risk of selecting a previously assigned identifier is negligible, as defined in concept D.
The method and system do not impose any restrictions on which clients are allowed to generate semantic nodes or paragraphs, as defined in concept E.
Methods and systems use comprehensive generic language concepts, as defined in concept F.
The method and system include question answering as defined in concept G.
Methods and systems include learning, as defined in concept H.
Methods and systems include translation, as defined in concept I.
Methods and systems include semantic node resolution as defined in concept J.
Methods and systems include translation between natural languages, as defined in concept K.
The method and system are for a voice assistant as defined in concept L.
Methods and systems use principles, as defined in concept M.
Optional features applicable to all concepts A-N
Note that any occurrence of "semantic node" or "paragraph" below may be generalized to a "structured machine-readable representation" or a "machine representation". Similarly, any occurrence of "structured machine-readable representation" or equivalent may be generalized to a "machine representation". In the appended claims, we use the term "machine representation" for brevity.
Simple syntax
The structured machine-readable representation includes a single grammar term to disambiguate the meaning of the structured representation of data.
Nesting of structured machine-readable representations of single grammar term representation data for disambiguation of meaning.
A single grammar term for disambiguating meaning represents the nesting of semantic nodes and paragraphs to arbitrary depths.
The single grammar term used to disambiguate the meaning of the combination is brackets or parentheses.
A single grammar term for disambiguating the meaning of the combination is the only grammar term for disambiguating the meaning of the combination.
A single grammar term for disambiguating the meaning of the combination is the primary grammar term for disambiguating the meaning of the combination.
Syntax applies to all nodes and combinations of nodes.
Syntax is a simple unambiguous syntax that includes nesting of nodes.
Syntax is a simple unambiguous syntax that includes nesting of nodes to arbitrary depths.
Syntax is a simple unambiguous syntax in which semantic nodes can only be combined in nested combinations.
Syntax allows expressions to nest indefinitely, allowing users to define concepts as a hierarchy of semantic nodes along with contextual information about the concepts.
The combining nodes may contain any limited number of semantic nodes, and the semantic nodes within these combining nodes may also be combining nodes that create any level of nesting.
Semantic links (such as ISA) between nodes are themselves semantic nodes.
The syntax of the machine-readable language is applicable to the combination of semantic nodes representing factual statements, query statements, and inference statements.
The syntax of the structured machine-readable representation of the data conforms or substantially conforms to the production grammar "<passage> ::= <id> | <passage> ::= (<passage> <passage>)", where the trailing "<passage>" inside the brackets represents zero, one or more further paragraphs, and where <id> is an identifier of a semantic node.
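By way of illustration only, the following minimal Python sketch reads text conforming to this production grammar into a nested structure, using brackets as the single device for disambiguating combinations. The whitespace tokenizer and the nested-list output format are assumptions of this sketch.

# Minimal sketch of a reader for the production
#   <passage> ::= <id> | ( <passage> <passage>* )
# Identifiers are taken to be any whitespace-delimited token that is not a bracket;
# the tokenizer and the nested-list output are assumptions of this sketch.
import re


def tokenize(text: str):
    return re.findall(r"\(|\)|[^\s()]+", text)


def parse_passage(tokens, pos=0):
    """Parse one <passage> starting at tokens[pos]; return (parsed, next_pos)."""
    if tokens[pos] != "(":
        return tokens[pos], pos + 1          # a bare <id> is itself a passage
    pos += 1                                 # consume "("
    children = []
    while tokens[pos] != ")":
        child, pos = parse_passage(tokens, pos)
        children.append(child)
    return children, pos + 1                 # consume ")"


if __name__ == "__main__":
    # Nested combinations of semantic-node identifiers, disambiguated only by brackets.
    text = "(Isa (Capital France) City)"
    tree, _ = parse_passage(tokenize(text))
    print(tree)   # ['Isa', ['Capital', 'France'], 'City']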
Node meaning
The machine-readable language is a general-purpose language for which anything that is expressible in essentially natural language can be expressed as a structured machine-readable representation of data or a combination of structured machine-readable representations of data.
The structured machine-readable representation of the data represents a particular entity, such as a word, concept, or other thing, and once generated, uniquely identifies that particular word, concept, or other thing in the machine-readable language.
Ordered or partially ordered sets of structured machine-readable representations of data capture specific meaning or semantic content.
The meaning of the structured machine-readable representation of data comes from statements written in a machine-readable language.
The meaning of the structured machine-readable representation of data comes from the structured machine-readable representation of other data representing what has been said about the structured machine-readable representation of data.
A structured machine-readable representation of data representing an entity encodes the semantic meaning of the entity by linking to a structured machine-readable representation of data of related words, concepts, other terms, or logical processes.
The structured machine-readable representation of the combined data generates new words, concepts or other terms in a machine-readable language having new meaning or semantic content.
Machine-readable languages are understandable to human users, corresponding to equivalent statements in natural language.
Generating nodes
Once defined, the semantic node has an identifier or ID.
Semantic nodes are identified with UUIDs.
The identifier is selected from an address space that is large enough to enable a user to select a new identifier independent of other users without repetition.
The identifier is selected from an address space that is large enough to enable the user to select a new identifier with negligible risk of selecting a previously assigned identifier.
The ID is UUID.
The ID is a 128-bit version 4 UUID (RFC 4122) with hyphenated lowercase syntax (a minimal sketch of minting such an identifier appears below).
The ID is a UUID or a string, such as a Unicode string.
The string may itself be marked as a structured machine-readable representation of the data; its meaning is then, strictly speaking, the string itself, and any natural language meaning contained within the string is not part of the meaning of the string.
The string may mark itself as a semantic node; its meaning is then, strictly speaking, just the string itself, and any natural language meaning contained within the string is not part of the meaning of the string.
The string is represented by an ID as an additional identifier.
The string is represented as a UUID or other numeric ID and a separate paragraph links the string to that numeric ID to provide its meaning.
Two identical strings used as semantic nodes share the same meaning, namely the string itself.
Any user can create its own semantic node with its own local meaning by picking the unused identifier.
Any user can create his own identifier for the semantic node even if another identifier is already used for the semantic node.
Any user is free to define his own meaning for the combination of semantic nodes.
There may be multiple different semantic nodes for the same specific word, concept, or other things.
Any user that chooses to create a machine representation (such as a paragraph) that uses the shared semantic nodes also expresses the same meaning by combining them, so that the meaning brought by combining the shared semantic nodes is generic.
There may be multiple different structured machine-readable representations of data for the same particular word, concept, or other thing.
Any user that chooses to create a paragraph of the structured machine-readable representation using the shared data also expresses the same meaning by combining them, so that the meaning brought by combining the structured machine-readable representations of the shared data is generic.
Each meaning of each word in the dictionary is represented by a semantic node.
The machine learning system generates paragraphs by autonomous learning from natural language documents or dialogs.
Paragraphs are derived from machine analysis of natural language documents (such as WWW pages or dialogues).
Semantic nodes are structured machine-readable representations of data that once defined have identifiers so that they can be referenced in a machine-readable language.
"shared ID" is an ID used by more than one user; "private ID" or "local ID" is similarly an ID that is used by only one user and is not issued or exposed to other users; the "common ID" is an ID that the user has used in the UL that each user can see.
A paragraph is a combination of semantic nodes that express meaning and is a unique nested structure.
Semantic nodes in an infinite class may be represented as a combination of multiple other nodes.
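As an illustration of the identifier scheme described above, the following minimal Python sketch mints a new semantic-node identifier as a version 4 UUID (RFC 4122) in hyphenated lowercase form; because the identifier carries 122 random bits, independently chosen identifiers collide with negligible probability. The function name is an assumption of this sketch.

# Minimal sketch: minting a new semantic-node identifier as a version 4 UUID
# (RFC 4122), rendered in hyphenated lowercase form. The address space holds
# 2**122 random values, so independently chosen identifiers collide with
# negligible probability. The function name is an assumption of this sketch.
import uuid


def new_semantic_node_id() -> str:
    return str(uuid.uuid4())   # e.g. '0d1f37e5-8c4a-4b9e-9b1a-2f6f2d1c3e4a'


if __name__ == "__main__":
    a = new_semantic_node_id()
    b = new_semantic_node_id()
    print(a, b, a != b)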
Extensibility and method for making same
The machine-readable language is extensible in that any natural language word, concept, or other thing can be represented by a structured machine-readable representation of data.
The machine-readable language is extensible in that there is no limit to which users can create a structured machine-readable representation of the data or related identifiers.
Questions
A question is expressed in the machine-readable language with: a paragraph including a node that identifies the paragraph as a question; language representing zero, one or more unknown entities requested within the semantics of the question; and language that represents the semantics of the question and references the zero, one or more unknown entities.
A question is expressed in the machine-readable language with: a paragraph of the form (Question <unknowns>)(<passage>), where Question is a semantic node, <unknowns> is a list of zero, one or more semantic nodes representing unknown values (similar in meaning to letters used as variables in algebra), and <passage> is where the unknown items are used to express what is being asked (a minimal sketch appears below).
Generating a response to a query includes three operations, namely: matching with a structured machine-readable representation of the data in the storage area, such as a paragraph, obtaining and executing the computing unit, and obtaining the inferred paragraph.
The question is represented in memory as a structured machine-readable representation of the data, and the representation of the question, the structured machine-readable representation of the data previously stored in the memory storage area, the computing unit, and the inference paragraph are all represented in substantially the same machine-readable language.
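The following minimal Python sketch, offered as an illustration only, assembles a question of the form (Question <unknowns>)(<passage>) from semantic-node identifiers. The human-readable node names ("Question", "X", "Isa", "City") stand in for real identifiers and are assumptions of this sketch.

# Minimal sketch: assembling a question of the form (Question <unknowns>) (<passage>)
# as nested tuples of semantic-node identifiers. The readable node names are
# assumptions of this sketch and stand in for real identifiers.

QUESTION = "Question"


def make_question(unknowns, body):
    """Return the two paragraphs that together express the question."""
    header = (QUESTION, *unknowns)   # (Question <unknowns>)
    return header, body              # followed by (<passage>) that uses the unknowns


if __name__ == "__main__":
    # "Which X is a city?" -- X is the unknown requested by the question.
    q = make_question(unknowns=["X"], body=("Isa", "X", "City"))
    print(q)   # (('Question', 'X'), ('Isa', 'X', 'City'))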
Reasoning
Inference is the generation of machine-readable language from other machine-readable languages using inference steps, which are represented as paragraphs that represent the semantics of the inference steps.
Inference is made by answering a series of one or more queries to see if the inference step is valid.
Inference is performed by answering a series of one or more queries to generate the results required for the result of the inference.
The paragraph represents detailed information of the computing unit required to select and run the computing unit, namely: define what the computing unit can do, how the computing unit is run, and how the results are interpreted.
The step of obtaining and executing one or more initial inference paragraphs returns other paragraphs with unknown terms that need to be processed, and the result of this processing is a junction tree that is used to give the results of the initial paragraphs.
The process of storing the junction tree and these other paragraphs with unknown terms occurs in parallel, allowing data acquisition and exploration of reasoning to be parallelized.
Once all paragraphs have been processed to a given maximum inference depth, a second non-parallelization step is used to traverse the tree of processed paragraphs and unknown mappings to find a valid answer.
Processing each paragraph in the paragraph list to identify a valid mapping from the paragraph memory store and the computing unit, wherein the valid mapping of the paragraph list is a mapping of: all unknown items have values and there is no contradictory mapping between paragraphs in the list.
The step of identifying valid mappings recursively browses the data and finds all valid mappings that can be returned as an answer to the initial question.
At least some of the paragraphs that have been generated from reasoning or computation are stored in a paragraph memory store so that they are available for faster processing in the future.
A history of these generated paragraphs is also stored so that changes in the level of trust in the paragraphs used to generate the paragraphs can be extended to give trust to these generated paragraphs.
A history of these generated paragraphs is also stored to enable removal of the generated paragraphs when the trusted state for one or more of them is changed.
When a new paragraph is added to the paragraph memory storage area, the new paragraph is assigned a low initial trust value when added by a normal user and a higher start value when added by a privileged user.
Signals from applications of the system or method are stored in association with paragraphs used by the applications in order to keep track of values of the paragraphs.
A value vector is assigned to a paragraph, where the number at each index represents the different qualities of the paragraph.
The different qualities include truthfulness, usefulness and efficiency.
The process using the paragraphs utilizes a priority vector, where the number at each index indicates the priority that the process gives to the corresponding quality.
The total value of the paragraph for that process can then be obtained from the dot product of the two vectors (see the sketch below).
The inference engine experiments with high-value paragraphs and lower-value paragraphs to answer questions; the answers provided by the inference engine are then monitored for any signal indicating whether a lower-value paragraph has a positive or negative impact on the answer, and this information is fed back into the automated monitoring process, which re-evaluates the values of the paragraphs using the new signal.
The automated monitoring process automatically tests paragraphs to determine if they should be used for question answering.
Structured machine-readable representations of data previously stored in memory storage areas have been monitored in an automated manner.
The question is the result of translating the natural language question asked by the user into a semantically substantially equivalent representation in the machine-readable language.
The response to the question is then translated into semantically equivalent natural language and presented to one or more users.
The question is the result of translating a question spoken by a user in natural language into a semantically substantially equivalent representation in the machine-readable language; a spoken answer is then played to the user, wherein the spoken answer is the result of translating the response to the question into natural language.
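To illustrate the value and priority vectors mentioned above, the minimal Python sketch below combines a paragraph's value vector with a process-specific priority vector via a dot product. The quality names and the numbers are assumptions of this sketch.

# Minimal sketch: scoring a paragraph by combining its value vector with a
# process-specific priority vector via a dot product. The quality names and
# numbers are illustrative assumptions.

QUALITIES = ("truthfulness", "usefulness", "efficiency")


def total_value(value_vector, priority_vector):
    """Dot product of the paragraph's value vector and the process's priorities."""
    return sum(v * p for v, p in zip(value_vector, priority_vector))


if __name__ == "__main__":
    paragraph_values = [0.9, 0.4, 0.7]     # one number per quality in QUALITIES
    process_priorities = [1.0, 0.5, 0.1]   # how much this process cares about each quality
    print(total_value(paragraph_values, process_priorities))   # 0.9 + 0.2 + 0.07, approx. 1.17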
Computing unit
The computing unit represents an individual computing capability available for reasoning and other purposes.
The computing unit is a semantic node.
The combination of paragraphs or semantic nodes represents the detailed information required to select and run the computing unit, namely: what the computing unit can do, how the computing unit is run, and how the results are interpreted.
The computing unit is used as appropriate during reasoning.
Learning
The new information learned is represented in a structured machine-readable representation of data conforming to a machine-readable language.
Learning the new information is obtained from automatically processing a structured machine-readable representation of the data to obtain or learn the new information, and the new information itself is represented as a structured machine-readable representation of the data stored in memory.
Learning new information is obtained from a machine learning system that generates classifications or predictions or other outputs that are represented as paragraphs.
The machine learning system processes semantic nodes and paragraphs to obtain or learn new information.
New information is generated by automatically processing semantic nodes and paragraphs to answer questions.
The question is represented as one or more machine representations (such as paragraphs) and the response to the question is automatically generated using one or more of the following steps: (i) matching the question with machine representations previously stored in a memory storage area; (ii) obtaining and executing one or more computing units, wherein the computing units represent computing capabilities associated with answering the question; (iii) obtaining and executing one or more inference machine representations (such as inference paragraphs), which are machine representations that represent the semantics of potentially applicable inference steps related to answering the question.
New information, represented as semantic nodes or paragraphs, is stored and used to improve the performance of learning new facts.
New information, represented as semantic nodes or paragraphs, is stored and used to improve the reasoning step.
New information represented as semantic nodes or paragraphs is stored and used to interpret or describe the new information in natural language.
New information represented as semantic nodes or paragraphs is stored and used for text or spoken dialog with a human user.
Learning new information occurs from a conversation with, or other natural language provided by, a human user: the natural language provided by the user in spoken or written form is translated into semantic nodes and paragraphs, and the new information represented by these semantic nodes and paragraphs is then stored and used.
Learning from reasoning, where semantic nodes and paragraphs generated from a series of reasoning steps are stored and utilized.
Learning from natural language, where all or part of a natural language document source, such as a web page, scientific paper, or other article, is translated into semantic nodes and paragraphs, and the resulting semantic nodes or paragraphs are then used by an application.
Using a non-document source of natural language, such as an audio recording or video of human speech: a text transcription of the speech recording is first created using speech recognition techniques, and the text transcription is then translated into semantic nodes or paragraphs (see the sketch below).
Machine learning systems are used to analyze document and non-document data and create paragraphs from the data.
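A minimal Python sketch of the non-document learning pipeline described above follows, offered as an illustration only: transcribe the speech, translate the transcript into UL paragraphs, and store them. transcribe() and translate_to_ul() are hypothetical placeholders rather than real APIs; they return canned toy values here so the sketch runs end to end.

# Minimal sketch of learning from a non-document source: transcribe speech, then
# translate the transcript into semantic nodes and paragraphs, and store them.
# transcribe() and translate_to_ul() are hypothetical placeholders for this sketch.


def transcribe(audio_path: str) -> str:
    # Placeholder for a real speech-recognition system.
    return "Paris is a city"


def translate_to_ul(text: str):
    # Placeholder for a real natural-language-to-UL translator; returns toy paragraphs.
    return [("Isa", "Paris", "City")]


def learn_from_recording(audio_path: str, paragraph_store: list) -> None:
    transcript = transcribe(audio_path)
    paragraphs = translate_to_ul(transcript)
    paragraph_store.extend(paragraphs)   # the learned paragraphs become available for later reasoning


if __name__ == "__main__":
    store = []
    learn_from_recording("meeting.wav", store)
    print(store)   # [('Isa', 'Paris', 'City')]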
Machine learning
The machine learning system is used to generate semantic nodes or paragraphs that represent words or word sequences in natural language form.
The machine learning system is a neural network system, such as a deep learning system.
Neural architecture is used to generate a machine-readable language.
The neural architecture utilizes recurrent neural networks or LSTMs or attention mechanisms or transformers.
Machine learning systems have been trained from training data comprising a natural language and a corresponding structured machine-readable representation, such as a machine-readable language comprising semantic nodes and paragraphs.
A segment of natural language is passed through a sequence-to-sequence neural architecture trained on training data comprising natural language and corresponding structured representations encoding its meaning.
The machine learning (e.g., neural network) system is a switch transformer feed-forward neural network system.
A machine learning system (e.g., neural architecture) includes an encoder and a decoder, and uses beam search to remove invalid semantic representations during decoding of the semantic representations from the decoder (see the sketch below).
When automatically translating a word sequence expressed in natural language (such as speech or text input) into a machine-readable language, the structure of the word sequence is compared to known machine-readable language structures in memory to identify similarity.
Automatic translation of a word or word sequence (such as speech or text input) into the machine-readable language is accomplished by referencing a store of previously identified correct translations between the natural language and the machine-readable language.
Automatic translation of a word or word sequence (such as speech or text input) into a machine-readable language is accomplished by utilizing a functional pipeline that transforms the word or word sequence into a series of intermediate forms.
The semantic impact of changes to the natural-language phrasing of a word or word sequence (such as speech or text input) being translated is automatically evaluated to determine whether a sufficiently accurate known or ground-truth example of a semantic node or paragraph can be used.
Semantic nodes or paragraphs that represent words or word sequences (such as speech or text input) provide machine-readable representations of the meaning of the words or word sequences.
Semantic nodes or paragraphs representing words or word sequences (such as speech or text input) are processed by a computer-based system for one or more of the following: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and complying with rules or guidelines; implementing one or more vertical applications.
Semantic nodes or paragraphs representing words or word sequences (such as speech or text input) are processed by a computer-based system to generate human-readable output.
Human-readable output includes one or more of the following: answers to questions expressed in natural language; an inference statement explaining how the system concludes; a learning statement explaining what the system has learned; responses in human-machine interactions.
Neural networks are trained end-to-end to directly transform audio or video data into semantic nodes and paragraphs.
Natural language based learning combined with statistical machine learning to optimize translation of document and non-document data (such as speech or text input) into semantic nodes and paragraphs.
The machine learning system is used to generate semantic nodes or paragraphs.
The neural network system is a switch transformer feed-forward neural network system.
By assigning semantic nodes to identifiers in the structured data and writing semantic nodes and paragraphs that correspond to the meaning of the structured data, structured data (such as the contents of tables found in documents or on the web, spreadsheets, or relational, graph, or other database contents) is transformed into semantic nodes and paragraphs.
Learning from analysis of other data where the data is algorithmically processed and the results of the processing are represented in terms of semantic nodes and paragraphs.
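As an illustration of the decoding-time filtering mentioned above, the following minimal Python sketch runs a beam search over decoder tokens and prunes candidates that can no longer form a well-formed bracketed paragraph. The toy next_token_scores() model and the small vocabulary are assumptions of this sketch; a real system would use the decoder of the trained sequence-to-sequence network.

# Minimal sketch: beam search over decoder output tokens, discarding candidates
# that can no longer form a well-formed bracketed UL paragraph. The toy scorer
# and vocabulary are assumptions of this sketch.
import math

VOCAB = ["(", ")", "Isa", "Paris", "City", "<eos>"]


def next_token_scores(prefix):
    """Toy stand-in for a decoder: uniform log-probabilities over the vocabulary."""
    return {tok: math.log(1.0 / len(VOCAB)) for tok in VOCAB}


def structurally_valid(prefix):
    """Reject prefixes that close more brackets than they open."""
    depth = 0
    for tok in prefix:
        depth += 1 if tok == "(" else -1 if tok == ")" else 0
        if depth < 0:
            return False
    if prefix and prefix[-1] == "<eos>":
        return depth == 0   # a finished sequence must have balanced brackets
    return True


def beam_search(beam_width=3, max_len=8):
    beams = [([], 0.0)]                       # (token prefix, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix and prefix[-1] == "<eos>":
                candidates.append((prefix, score))
                continue
            for tok, logp in next_token_scores(prefix).items():
                new_prefix = prefix + [tok]
                if structurally_valid(new_prefix):   # prune invalid semantic representations
                    candidates.append((new_prefix, score + logp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams


if __name__ == "__main__":
    for prefix, score in beam_search():
        print(" ".join(prefix), round(score, 2))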
Translation to and from UL
Receive a word or word sequence in natural language form and automatically translate the word or word sequence into machine-readable language by identifying or generating a structured machine-readable representation that semantically represents the meaning of the word or word sequence.
The machine learning system is used to generate semantic nodes or paragraphs that represent words or word sequences in natural language form.
The machine learning system is a neural network system, such as a deep learning system.
Neural architecture is used to generate a machine-readable language.
The neural architecture utilizes recurrent neural networks or LSTMs or attention mechanisms or transformers.
Machine learning systems have been trained from training data comprising a natural language and a corresponding structured machine-readable representation, such as a machine-readable language containing semantic nodes and paragraphs.
A segment of natural language is passed through a sequence-to-sequence neural architecture trained on training data comprising natural language and corresponding structured representations encoding its meaning.
The neural network system is a switch transformer feed-forward neural network system.
The neural architecture includes an encoder and a decoder, and uses beam search to remove invalid semantic representations during decoding of the semantic representations from the decoder.
The word or word sequence in natural language is a question and the question is answered with reference to the semantic representation.
The word or word sequence in natural language is one or more documents, and the semantic representation of the one or more documents is used to answer the question.
Inference of the reference semantic representation yields a further new representation.
When automatically translating word sequences expressed in natural language into machine-readable language, the structure of the word sequences is compared to known machine-readable language structures in memory to identify similarity.
Automatic translation of a word or word sequence into the machine-readable language is achieved by referencing a store of previously identified correct translations between the natural language and the machine-readable language.
Automatic translation of a word or word sequence into a machine-readable language is achieved by utilizing a functional pipeline that transforms the word or word sequence into a series of intermediate forms.
The semantic impact of changes to the natural-language phrasing of the word or word sequence being translated is automatically evaluated to determine whether a sufficiently accurate known or ground-truth example of a semantic node or paragraph can be used.
Semantic nodes or paragraphs representing words or word sequences provide machine readable representations of the meaning of the words or word sequences.
Semantic nodes or paragraphs representing words or word sequences are processed by a computer-based system for one or more of the following: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and complying with rules or guidelines; implementing one or more vertical applications.
Semantic nodes or paragraphs representing words or word sequences are processed by a computer-based system to generate human-readable output.
Human-readable output includes one or more of the following: answers to questions expressed in natural language; an inference statement explaining how the system concludes; a learning statement explaining what the system has learned; responses in human-machine interactions.
The system is further operable to automatically translate from the structured machine-readable representation to natural language.
When translating from a structured machine-readable representation to natural language, the system varies the generated translation between semantically substantially equivalent alternatives, creating varied and fresh responses for the benefit of the human user (see the sketch below).
Automatic translation of a word or word sequence into a machine-readable language is accomplished by referencing the context of information related to generating the correct translation.
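The minimal Python sketch below, an illustration only, renders a UL paragraph into natural language while varying among semantically equivalent phrasings so that repeated answers feel fresh. The templates and the example paragraph are assumptions of this sketch.

# Minimal sketch: translating a UL paragraph back to natural language while
# varying among semantically equivalent phrasings. Templates are assumptions;
# casing and inflection are glossed over in this toy example.
import random

TEMPLATES = {
    "Isa": [
        "{0} is a {1}.",
        "{0} is one example of a {1}.",
        "Yes, {0} is a {1}.",
    ],
}


def render(paragraph) -> str:
    relation, *args = paragraph
    template = random.choice(TEMPLATES[relation])   # pick one of the equivalent phrasings
    return template.format(*args)


if __name__ == "__main__":
    for _ in range(3):
        print(render(("Isa", "Paris", "city")))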
Semantic node resolution
Providing a service operable to receive a description of an entity and return one or more identifiers for a structured machine-readable representation of data corresponding to the entity, such that a user can use the shared identifier for the entity.
The description is described in part or in whole in a machine-readable language.
The description is written in part or in whole in one or more natural languages.
The service compares the description of the proposed semantic node or paragraph with the available information about the existing entity to determine if there is a match.
The service probabilistically determines whether there is a match.
The service additionally returns a match probability and one or more identifiers.
If no match is found, the service returns a new identifier.
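A minimal Python sketch of such a resolution service follows, as an illustration only: it compares a description against known entities and returns a matching identifier together with a match probability, or mints a new identifier if nothing matches. The token-overlap matcher and the example entities are assumptions of this sketch standing in for the probabilistic matching described above.

# Minimal sketch of a semantic-node resolution service: given a description,
# return the identifier of a matching existing entity together with a match
# probability, or a fresh identifier if nothing matches. The token-overlap
# matcher and example entities are assumptions of this sketch.
import uuid

KNOWN_ENTITIES = {
    "0a1b2c3d-0000-4000-8000-000000000001": "capital city of France on the Seine",
    "0a1b2c3d-0000-4000-8000-000000000002": "chemical element with atomic number 79",
}


def resolve(description: str, threshold: float = 0.5):
    words = set(description.lower().split())
    best_id, best_score = None, 0.0
    for node_id, known in KNOWN_ENTITIES.items():
        known_words = set(known.lower().split())
        score = len(words & known_words) / max(len(words | known_words), 1)  # Jaccard overlap
        if score > best_score:
            best_id, best_score = node_id, score
    if best_score >= threshold:
        return best_id, best_score          # probable match to an existing node
    return str(uuid.uuid4()), 0.0           # no match: mint a new identifier


if __name__ == "__main__":
    print(resolve("capital city of France on the Seine"))
    print(resolve("a small furry mammal that purrs"))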
Principle of
The structured representation of data includes one or more principles, statements, or other rules that define goals or motivations, which are also represented using the structured representation of data; potential actions are analyzed to determine whether executing them would optimize or otherwise affect the achievement or implementation of those principles, statements, or other rules; and those actions are automatically selected, decided upon, or performed only if they optimize or otherwise positively affect the achievement or implementation of those principles, statements, or other rules (a minimal sketch appears at the end of this section).
Actions conforming to the principles, statements, or other rules ("principles") are automatically proposed by referencing the principles.
Actions include communicating with the user in written form.
The action includes communicating with the user in verbal form.
The principle includes at least one metric, such as user happiness, that the system should attempt to maximize.
The principle includes at least one metric that the system should attempt to minimize, such as user unhappiness.
The principle comprises at least one rule defining actions that the system must not do.
The system is further operable to avoid performing actions that the system must not do, by reference to the principle.
The principle includes: at least one suggestion of what action to do under defined conditions.
The principles include sub-principles, which are principles related to other principles or are more specific examples of another principle.
Actions include accessing other remote computer systems.
The actions include changing the state of a device linked to the system via a network.
The action includes initiating a verbal interaction with a human.
The data store contains a machine-readable representation of the world encoding meaning, and wherein the system is further operable to reason with reference to the machine-readable representation of the world to select the action that complies with the rules.
The machine-readable representation of the world includes a representation of the effective inference steps, and wherein the system is further operable to infer as the representation of the effective inference steps.
The machine-readable representation of the world comprises a representation of computing power available to the system, and wherein the system is further operable to utilize the computing power by referencing the machine-readable representation.
The system is operable to learn and augment a machine-readable representation of the world.
The system is operable to learn from communications with at least one user.
The system is operable to learn from at least one external sensor connected to the system via a network.
The machine-readable principles are represented at least in part by a combination of identifiers, and wherein at least some of the identifiers represent concepts corresponding to real world things.
The system is further operable to receive a description of the concept from the remote system and use the description to return an identifier that may mean the concept.
The system is operable to continue reasoning in a manner that results in actions that comply with the principles.
The system is operable to answer questions from a human user regarding the principle.
The computer system includes a long-term memory; a short-term memory; and a rules store containing machine-readable rules representing the rules guiding the system, and wherein the computer system is operable to receive events and utilize the events, the contents of the long-term memory, the contents of the short-term memory, and the rules to perform actions that comply with the rules.
The computer system includes: means for generating a candidate action; the reference principle determines whether to execute the candidate action and execute the action.
Answering a question from a human user includes two actions: generating a response to the question and transmitting the response to the human user.
The event comprises a communication from at least one user, and the action comprises a communication to at least one user.
The system is further operable to learn and store the content of the system learning to long term memory.
The computer system is not operable to change the principle.
The principle includes a principle that prohibits actions that may lead to changing the principle.
The system is further operable to perform an independent check for each potential action against the principle and to discard the potential action if the independent check finds that the potential action is not compatible with the principle.
The computer system is further operable to actively exclude knowledge about itself for determining the action.
The potential actions are autonomously generated by the computer-based system.
The potential actions are autonomously generated by the computer-based system as output to process inputs, such as audio or text.
The potential actions are generated autonomously with a process that operates substantially continuously.
The potential actions are generated autonomously without any external trigger event for initiating the process or user instruction or action for initiating the process.
If the potential action optimizes or otherwise positively affects the achievement or implementation of those principles, statements, or other rules, then the potential action is automatically performed.
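To illustrate the independent check described in this section, the following minimal Python sketch tests each candidate action against machine-readable principles and performs it only if it violates no prohibition and positively affects a metric the principles say should be maximized. The principle encoding and the action fields are assumptions of this sketch.

# Minimal sketch: checking candidate actions against machine-readable principles.
# An action is kept only if it violates no prohibition and furthers at least one
# metric to be maximized. The encoding and field names are assumptions of this sketch.

PRINCIPLES = {
    "maximize": ["user_happiness"],
    "prohibited": ["reveal_private_data"],
}


def complies(action: dict) -> bool:
    if any(effect in PRINCIPLES["prohibited"] for effect in action.get("effects", [])):
        return False                                   # independent check: discard prohibited actions
    return any(action.get("impact", {}).get(metric, 0) > 0
               for metric in PRINCIPLES["maximize"])   # must further at least one goal


def act(candidate_actions):
    return [a["name"] for a in candidate_actions if complies(a)]


if __name__ == "__main__":
    candidates = [
        {"name": "answer_question", "effects": [], "impact": {"user_happiness": 1}},
        {"name": "share_health_record", "effects": ["reveal_private_data"],
         "impact": {"user_happiness": 1}},
    ]
    print(act(candidates))   # ['answer_question']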

Claims (206)

1. A computer-implemented method for automatically analyzing or using data, comprising the steps of:
(a) Storing in memory a structured machine-readable representation ("machine representation") of data conforming to a machine-readable language; the machine representation includes: a representation of user speech or text input for a human-machine interface;
(b) The machine representation is automatically processed to analyze the user speech or text input.
2. The method of claim 1, when implemented in a voice assistant or chat robot.
3. The method of claim 1, wherein the machine representation includes semantic nodes and paragraphs; and wherein the semantic node represents an entity and is itself represented by an identifier; and a paragraph is either (i) a semantic node or (ii) a combination of semantic nodes; and wherein the machine-readable meaning is derived from the selection of semantic nodes and the manner in which the semantic nodes are combined and ordered into paragraphs.
Translation to and from UL
4. The method of any preceding claim, wherein the user speech or text input is natural language and is received and automatically translated into the machine readable language by identifying or generating a machine representation that semantically represents the meaning of the input.
5. The method of any of the preceding claims, wherein the machine representation representing the input is generated using a machine learning system.
6. The method of claim 5, wherein the machine learning system is a neural network system, such as a deep learning system.
7. The method of claim 5 or 6, wherein the machine readable language is generated using a neural architecture.
8. The method of claim 7, wherein the neural architecture utilizes a recurrent neural network or LSTM or attention mechanism or transformer.
9. The method of any of claims 5 to 8, wherein the machine learning system has been trained from training data comprising natural language and corresponding machine representations.
10. The method of any of claims 5 to 9, wherein a segment of natural language input is passed through a sequence-to-sequence neural architecture trained in accordance with training data comprising natural language and corresponding machine representations.
11. The method of any of claims 5 to 10, wherein the machine learning system is a switch transformer feed-forward neural network system.
12. The method of any of claims 5 to 11, wherein the machine learning system comprises an encoder and a decoder, and wherein beam search is used to remove invalid semantic representations during decoding of the machine representation from the decoder.
13. The method of any preceding claim, wherein the speech or text input is a question and the question is answered with reference to the machine representation.
14. A method according to any preceding claim, wherein the speech or text input in natural language is one or more documents and the machine representation of the one or more documents is used to answer questions.
15. A method according to any of the preceding claims, wherein reasoning about the machine representation yields a further new machine representation.
16. The method of any preceding claim, wherein when a sequence of speech or text inputs expressed in the natural language is automatically translated into the machine readable language, the structure of the sequence of speech or text inputs is compared to known machine readable language structures in the memory to identify similarity.
17. The method of any of the preceding claims, wherein automatically translating the speech or text input into the machine-readable language is accomplished by referencing a store of previously identified correct translations between the natural language and the machine-readable language.
18. The method of any of the preceding claims, wherein automatically translating the speech or text input into the machine readable language is accomplished by utilizing a functional pipeline that transforms the speech or text input into a series of intermediate forms.
19. The method of any of the preceding claims, wherein the semantic impact of changes to the natural-language phrasing of the speech or text input is automatically evaluated to determine whether a sufficiently accurate known or ground-truth example of a machine representation can be used.
20. The method of any of the preceding claims, wherein the machine representation represents the speech or text input and is processed by a computer-based system for one or more of: deriving facts or relationships; reasoning; learning; translation; answering questions; processing natural language content; enabling human-computer interaction; representing and complying with rules or guidelines; implementing one or more vertical applications.
21. The method of any of the preceding claims, wherein the machine representation representing the speech or text input is processed by a computer-based system to generate a human-readable output.
22. The method of any of the preceding claims, wherein the human-readable output comprises one or more of: answers to questions expressed in the natural language; an inference statement explaining how the system concludes; a learning statement that explains what the system has learned; responses in human-machine interactions.
23. The method of any of the preceding claims, wherein a machine representation is automatically translated into the natural language.
24. A method according to any of the preceding claims, wherein the system varies the generated translations between semantically substantially equivalent alternatives when translating from the machine representation to the natural language, thereby creating varied and fresh responses for the benefit of a human user.
25. The method of any of the preceding claims, wherein automatically translating the speech or text input into the machine-readable language is accomplished by referencing a context of information related to generating a correct translation.
26. The method according to any of the preceding claims, wherein a neural network that has been trained end-to-end is utilized to directly convert audio or video data into a machine representation.
27. The method of any of the preceding claims, wherein natural language based learning is combined with statistical machine learning to optimize translation of speech or text input into machine representations.
28. A method according to any one of the preceding claims, wherein learning is performed from analysis of other data, wherein the data is algorithmically processed and the results of the processing are represented in a machine representation.
Simple syntax
29. The method of any of the preceding claims, wherein the machine representation comprises a single grammar term to disambiguate the meaning of the machine representation.
30. The method of claim 29, wherein the single grammar term for disambiguating meaning represents nesting of the machine representation.
31. The method of claim 29 or 30, wherein the single grammar term for disambiguating meaning represents a semantic node and nesting of paragraphs to arbitrary depths.
32. The method of any of claims 29 to 31, wherein the single grammatical item used to disambiguate the meaning of the combination is brackets or parentheses.
33. The method of any of claims 29 to 32, wherein the single grammar term for disambiguating the meaning of the combination is the only grammar term for disambiguating the meaning of the combination.
34. The method of any of claims 29 to 33, wherein the single grammar term for disambiguating the meaning of the combination is a primary grammar term for disambiguating the meaning of the combination.
35. The method of any of claims 29 to 34, wherein there is a syntax applicable to all nodes and combinations of nodes.
36. The method of claim 35, wherein the syntax is a simple unambiguous syntax that includes nesting of semantic nodes.
37. The method of claim 36, wherein the syntax is a simple unambiguous syntax that includes nesting of semantic nodes to arbitrary depths.
38. The method of any of the preceding claims 36-37, wherein the syntax is a simple unambiguous syntax, wherein semantic nodes can only be combined in nested combinations.
39. The method of any of the preceding claims 36-38, wherein the syntax allows expressions to be infinitely nested to allow a user to define concepts as a hierarchy of semantic nodes along with contextual information about the concepts.
40. A method according to claim 3 and any one of the preceding claims when dependent on claim 3, wherein a combined node can contain any limited number of semantic nodes and the semantic nodes within the combined node can also be combined nodes creating any level of nesting.
41. A method according to claim 3 and any one of the preceding claims when dependent on claim 3, wherein the semantic links between nodes such as ISA are themselves semantic nodes.
42. The method of claim 3 and any one of the preceding claims when dependent on claim 3, wherein the syntax for the machine-readable language is adapted to represent a combination of semantic nodes of factual statements, query statements, and inference statements.
43. The method of any of the preceding claims 29-42, wherein the syntax of the machine representation conforms or substantially conforms to the production grammar "<passage> ::= <id> | <passage> ::= (<passage> <passage>)", wherein the trailing "<passage>" inside the brackets represents zero, one or more further paragraphs, and wherein <id> is an identifier of a semantic node.
Node meaning
44. The method of any of the preceding claims, wherein the machine-readable language is a general-purpose language for which anything that can be expressed in substantially natural language can be expressed as a machine representation or a combination of machine representations.
45. A method according to any one of the preceding claims, wherein a machine representation represents a specific entity, such as a word, concept or other thing, and once generated, the machine representation uniquely identifies the specific word, concept or other thing in the machine readable language.
46. The method of any of the preceding claims, wherein the ordered or partially ordered set of machine representations captures specific meaning or semantic content.
47. A method according to any preceding claim, wherein the meaning of a machine representation is from a statement written in the machine-readable language.
48. A method according to any of the preceding claims, wherein the meaning of a machine representation is from other machine representations representing something that has been said about the machine representation.
49. A method according to any of the preceding claims, wherein the machine representation representing an entity encodes the semantic meaning of the entity by linking to a machine representation of related words, concepts, other terms or logical processes.
50. The method of any of the preceding claims, wherein combining machine representations generates new words, concepts or other terms in the machine-readable language having new meaning or semantic content.
51. The method of any of the preceding claims, wherein the machine readable language is understandable to a human user, corresponding to an equivalent statement in natural language.
Generating nodes
52. The method according to any of the preceding claims, wherein a machine representation, such as a semantic node, has an identifier or ID once defined.
53. A method as defined in claim 52, wherein the machine representation includes a plurality of identifiers selected from an address space that is large enough to enable a user to select a new identifier while the risk of selecting a previously assigned identifier is negligible.
54. A method according to any of the preceding claims 52-53, wherein the machine representation comprises a plurality of identifiers selected from an address space that is large enough to enable a user to select a new identifier independently of other users without repetition.
55. The method of any one of claims 52 to 54, wherein the ID is a UUID.
56. The method of any of claims 52 to 55, wherein the ID is a 128-bit version 4 UUID (RFC 4122) with hyphenated lowercase syntax.
57. A method according to any one of claims 52 to 56, wherein the ID is a UUID or a string, such as a Unicode string.
58. The method of claim 57, wherein a string can mark itself as a machine representation; its meaning is then, strictly speaking, just the string itself, and any natural language meaning contained within the string is not part of the meaning of the string.
59. The method of any of the preceding claims 57-58, wherein the string is represented by an ID as an additional identifier.
60. The method of any of the preceding claims 57-59, wherein a string is represented as a UUID or other numerical ID, and a separate paragraph links the string to the numerical ID to provide its meaning.
61. The method according to any of the preceding claims 57-60, wherein two identical strings used as machine representations (such as semantic nodes) share the same meaning, namely the string itself.
62. A method according to any of claims 52 to 61, wherein any user is able to create his own machine representation, such as a semantic node, with his own local meaning by picking up unused identifiers.
63. A method according to any one of claims 52 to 62, wherein any user is able to create his own identifier for a semantic node even though another identifier has been used for the semantic node.
64. A method according to any of the preceding claims, wherein any user is free to define his own meaning for a combination of machine representations such as semantic nodes.
65. A method according to any of the preceding claims, wherein there can be a plurality of different machine representations, such as semantic nodes, for the same specific word, concept or other thing.
66. The method according to any of the preceding claims, wherein any user selecting to create a paragraph using a shared machine representation such as a semantic node also expresses the same meaning by combining the shared machine representations such that the meaning brought by combining the shared machine representations is generic.
67. A method according to any of the preceding claims, wherein there are a plurality of different machine representations for the same specific word, concept or other thing.
68. The method of any of the preceding claims, wherein any user selecting to create a combination of machine representations using a shared machine representation also expresses the same meaning by combining the shared machine representations such that the meaning brought by combining the shared machine representations is generic.
69. The method according to any of the preceding claims, wherein each meaning of each word in the dictionary is represented by a machine representation such as a semantic node.
70. The method of any of the preceding claims, wherein the machine learning system generates a machine representation such as a paragraph by autonomous learning from natural language documents or dialogs.
71. The method according to any of the preceding claims, wherein the machine representation such as a paragraph is derived from a machine analysis of natural language documents such as WWW pages or dialogs.
72. The method according to any of the preceding claims, wherein a machine representation such as a semantic node is a structured machine readable representation of data, once defined, having an identifier so as to be capable of being referenced in the machine readable language.
73. The method of any one of the preceding claims, wherein a "shared ID" is an ID used by more than one user; a "private ID" or "local ID" is similarly an ID that is used by only one user and is not published or exposed to other users; and a "common ID" is an ID that has been used in UL that every user can see.
74. The method according to any of the preceding claims, wherein machine representations such as semantic nodes in infinite classes are represented as a combination of a plurality of other machine representations such as semantic nodes.
Extensibility and method for making same
75. The method of any of the preceding claims, wherein the machine-readable language is extensible in that any natural language word, concept, or other thing can be represented by a machine representation.
76. The method of any of the preceding claims, wherein the machine-readable language is extensible in that there is no limit as to which users can create machine representations or related identifiers.
Questions
77. The method of any of the preceding claims, wherein a question is represented in the machine-readable language with: (i) a machine representation comprising a machine representation that identifies it as a question; (ii) a language representing zero, one or more unknown entities requested within the semantics of the question; and (iii) a language that represents the semantics of the question and references the zero, one or more unknown entities.
78. The method of any of the preceding claims, wherein a question is represented in the machine-readable language with: (i) A paragraph comprising a machine representation, such as a semantic node, that identifies the paragraph as a question; (ii) A language representing zero or one or more unknown entities requested within the semantics of the question; and (iii) a language that represents the semantics of the problem and references the zero or one or more unknown entities.
79. The method of any of the preceding claims, wherein a question is represented in the machine-readable language with: a paragraph of the form (Question <unknowns>)(<passage>), where Question is a semantic node, <unknowns> is a list of zero, one or more semantic nodes representing unknown values (similar in meaning to letters used as variables in algebra), and <passage> is where the unknown items are used to express what is being asked.
80. The method of any of the preceding claims, wherein generating a response to a query comprises three operations, namely: matching with a machine representation such as a paragraph in a storage area, obtaining and executing a computing unit, and obtaining an inference paragraph.
81. The method of any of the preceding claims, wherein a question is represented in the memory as a machine representation, and the representation of the question, the machine representations previously stored in the memory storage area, the computing units, and the inference paragraphs are all represented in substantially the same machine-readable language.
Reasoning
82. A method according to any of the preceding claims, wherein an inference is the generation of a machine readable language from other machine readable languages using an inference step, said inference step being represented as a machine representation, such as a paragraph, representing the semantics of the inference step.
83. A method according to any preceding claim, wherein reasoning is performed by answering a series of one or more queries to see if the reasoning step is valid.
84. A method according to any one of the preceding claims, wherein reasoning is performed by answering a series of one or more queries to generate results required for the results of the reasoning.
85. The method according to any of the preceding claims, wherein a machine representation such as a paragraph represents detailed information of a computing unit required for selecting and operating said computing unit, namely: defines what the computing unit can do, how the computing unit is run, and how the results are interpreted.
86. A method according to any one of the preceding claims, wherein the step of obtaining and executing one or more initial inference paragraphs returns further paragraphs containing unknowns that need to be processed, and the results of that processing are stored in a tree used to give the result for the initial paragraph.
87. The method of claim 86, wherein the storing of the result tree and the processing of the further paragraphs containing unknowns occur in parallel, allowing data acquisition and the exploration of the reasoning to be parallelized.
88. The method of claim 87, wherein, once all paragraphs have been processed to a given maximum inference depth, a second, non-parallelized step is used to traverse the tree of processed paragraphs and unknown mappings to find valid answers.
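A minimal sketch of the parallel processing described in claims 86 to 88: paragraphs containing unknowns are processed in parallel up to a maximum inference depth, building a tree of results, and a second non-parallel pass then walks the tree to collect answers. The process_paragraph worker is a stub, and all names, depths and data structures are illustrative assumptions rather than the claimed implementation.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_DEPTH = 3  # illustrative maximum inference depth

def process_paragraph(paragraph, depth):
    """Stub worker: returns (mappings, further_paragraphs).  A real system
    would consult the paragraph store, computing units and inference
    paragraphs and may return further paragraphs with unknowns."""
    return [{"X": f"value-at-depth-{depth}"}], []

def build_result_tree(paragraphs, depth=0):
    """Process a list of paragraphs in parallel, recording each node's
    mappings and recursing into any further paragraphs returned."""
    tree = []
    if depth >= MAX_DEPTH or not paragraphs:
        return tree
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda p: process_paragraph(p, depth), paragraphs))
    for paragraph, (mappings, children) in zip(paragraphs, results):
        tree.append({
            "paragraph": paragraph,
            "mappings": mappings,
            "children": build_result_tree(children, depth + 1),
        })
    return tree

def collect_answers(tree):
    """Second, non-parallel pass: walk the tree of processed paragraphs
    and gather the unknown mappings found at each node."""
    answers = []
    for node in tree:
        answers.extend(node["mappings"])
        answers.extend(collect_answers(node["children"]))
    return answers

if __name__ == "__main__":
    print(collect_answers(build_result_tree([("IsCapitalOf", "X", "France")])))
```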
89. The method of any of the preceding claims, wherein each paragraph in a paragraph list is processed to identify valid mappings from the paragraph memory store and the computing units, wherein a valid mapping for the paragraph list is one in which every unknown has a value and there are no contradictory mappings between paragraphs in the list.
90. The method of any preceding claim, wherein the step of identifying valid mappings recursively explores the data and finds all valid mappings for the initial question that can be returned as answers.
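As an illustration of the "no contradictory mappings" condition of claim 89, the sketch below merges per-paragraph candidate mappings for a paragraph list and rejects combinations in which the same unknown receives different values. Function and variable names are illustrative assumptions.

```python
from itertools import product

def merge_mappings(mapping_a, mapping_b):
    """Merge two unknown->value mappings, or return None if they bind the
    same unknown to different values (a contradiction)."""
    merged = dict(mapping_a)
    for unknown, value in mapping_b.items():
        if unknown in merged and merged[unknown] != value:
            return None
        merged[unknown] = value
    return merged

def valid_mappings(per_paragraph_mappings, required_unknowns):
    """Given the candidate mappings found for each paragraph in the list,
    return every combined mapping that is consistent across all paragraphs
    and gives a value to every required unknown."""
    results = []
    for combination in product(*per_paragraph_mappings):
        combined = {}
        for mapping in combination:
            combined = merge_mappings(combined, mapping)
            if combined is None:
                break
        if combined is not None and all(u in combined for u in required_unknowns):
            results.append(combined)
    return results

# Example: two paragraphs agree on X in one combination and contradict in the other.
print(valid_mappings(
    [[{"X": "paris"}, {"X": "london"}], [{"X": "paris", "Y": "france"}]],
    required_unknowns=("X", "Y"),
))  # [{'X': 'paris', 'Y': 'france'}]
```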
91. A method according to any of the preceding claims, wherein at least some of the paragraphs that have been generated from reasoning or computation are stored in a paragraph memory storage area so that they can be used for faster processing in the future.
92. A method according to claim 91, wherein a history of the generated paragraphs is also stored, such that changes in the trust level of the paragraphs used to generate a paragraph can be propagated to give trust to the generated paragraph.
93. A method according to claim 91 or 92, wherein a history of the generated paragraphs is also stored to enable removal of a generated paragraph when the trusted status of one or more of the paragraphs used to generate it changes.
94. The method of any of the preceding claims 91-93, wherein, when added to the paragraph memory storage area, a new paragraph is assigned a low initial trust value if added by a normal user and a higher starting value if added by a privileged user.
95. A method according to any preceding claim, wherein signals from an application of the system or method are stored in association with the paragraphs used by the application, in order to keep track of the value of those paragraphs.
96. The method of any preceding claim, wherein paragraphs are assigned value vectors, wherein the numbers at each index represent different qualities of the paragraphs.
97. The method of claim 96, wherein the different qualities include any of authenticity, availability, and efficiency.
98. The method of any of the preceding claims 96-97, wherein a process that uses the paragraphs utilizes a priority vector, wherein the number at each index indicates the priority that the process gives to the corresponding quality.
99. The method of any of the preceding claims 96-98, wherein a total value of the paragraph for that process is then obtainable as the dot product of the two vectors.
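For instance, with a value vector whose indices stand for qualities such as authenticity, availability and efficiency (claim 97), and a process-specific priority vector over the same indices, the total value can be computed as a dot product, as sketched below. The vector contents are invented for illustration.

```python
def total_value(value_vector, priority_vector):
    """Dot product of a paragraph's value vector with a process's priority
    vector: each index weights one quality of the paragraph."""
    if len(value_vector) != len(priority_vector):
        raise ValueError("vectors must have the same length")
    return sum(v * p for v, p in zip(value_vector, priority_vector))

# Illustrative qualities: [authenticity, availability, efficiency]
paragraph_value = [0.9, 0.4, 0.7]
process_priority = [1.0, 0.2, 0.5]   # this process cares mostly about authenticity
print(total_value(paragraph_value, process_priority))  # 1.33
```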
100. The method of any preceding claim 96-99 in which an inference engine experiments with high value paragraphs and lower value paragraphs to answer the question, and then monitors the answer provided by the inference engine for any signal indicating whether the lower value paragraph has a positive or negative effect on the answer, and this information is then fed back into an automated monitoring process which re-rates the value of the paragraph with the new signal.
101. The method of any of the preceding claims, wherein the automated monitoring process automatically tests paragraphs to determine if the paragraphs should be used for question answering.
102. A method according to any preceding claim, wherein the machine representation previously stored in memory storage has been monitored in an automated way.
103. The method of any of the preceding claims, wherein the question is the result of translating a natural-language question asked by a user into a semantically substantially equivalent representation in the machine-readable language.
104. The method of claim 103, wherein the response to the question is subsequently translated into semantically equivalent natural language and presented to one or more users.
105. The method of any of the preceding claims 103-104, wherein the question is the result of translating a question spoken by a user in a natural language into a semantically substantially equivalent representation in the machine-readable language, and wherein a verbal answer is subsequently played to the user, the verbal answer being the result of translating the response to the question into the natural language.
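A hedged sketch of the spoken question-answering loop implied by claims 103 to 105: speech is transcribed, translated into the machine-readable language, answered, translated back and spoken. Every function here is a placeholder standing in for a component the claims do not specify, and the hard-coded example data are assumptions.

```python
def transcribe(audio_bytes):
    """Placeholder speech recognition: audio in, text out."""
    return "what is the capital of france"

def nl_to_machine(text):
    """Placeholder translation of natural language into a machine
    representation of a question (one unknown X plus a passage)."""
    return {"unknowns": ("X",), "passage": (("IsCapitalOf", "X", "France"),)}

def answer_question(question):
    """Placeholder reasoning step returning a mapping for the unknowns."""
    return {"X": "Paris"}

def machine_to_nl(question, answer):
    """Placeholder translation of the answer back into natural language."""
    return f"The capital of France is {answer['X']}."

def speak(text):
    """Placeholder text-to-speech."""
    print(f"[spoken] {text}")

def handle_spoken_question(audio_bytes):
    question = nl_to_machine(transcribe(audio_bytes))
    answer = answer_question(question)
    speak(machine_to_nl(question, answer))

handle_spoken_question(b"...")
```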
Computing units
106. The method according to any of the preceding claims, wherein a computing unit represents an individual computing capability that can be used for reasoning and for other purposes.
107. The method of claim 106, wherein the computing unit is a machine representation such as a semantic node.
108. The method of any of the preceding claims 106-107, wherein a combination of paragraphs or semantic nodes represents the detailed information about the computing unit required to select and run it, namely: what the computing unit can do, how it is run, and how its results are interpreted.
109. The method of any of the preceding claims 106-108, wherein the computing unit is used where appropriate during reasoning.
110. A method according to any one of the preceding claims, wherein a question is represented as one or more machine representations, such as paragraphs, and a response to the question is automatically generated using one or more of the following steps: (i) matching the question with machine representations previously stored in a memory storage area; (ii) obtaining and executing one or more computing units, wherein a computing unit represents computing capability relevant to answering the question; (iii) obtaining and executing one or more inference machine representations, such as inference paragraphs, which are machine representations representing the semantics of potentially applicable inference steps relevant to answering the question.
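One possible, purely illustrative way to encode the computing unit description referred to in claims 85 and 108 above: a pattern naming the unit's input and output unknowns, plus a callable that performs the computation. The claims leave the concrete encoding open; the field names and the Sum example here are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass
class ComputingUnitDescription:
    """Describes what a computing unit can do (the pattern it matches),
    how it is run (the callable), and how its results are interpreted
    (which unknowns in the pattern are outputs)."""
    pattern: Tuple[str, ...]          # e.g. ("Sum", "A", "B", "Result")
    input_unknowns: Tuple[str, ...]   # unknowns that must be bound before running
    output_unknowns: Tuple[str, ...]  # unknowns the unit will bind
    run: Callable[[Dict[str, object]], Dict[str, object]]

def add_unit_run(inputs):
    return {"Result": inputs["A"] + inputs["B"]}

add_unit = ComputingUnitDescription(
    pattern=("Sum", "A", "B", "Result"),
    input_unknowns=("A", "B"),
    output_unknowns=("Result",),
    run=add_unit_run,
)

# Interpreting the result: the returned mapping binds the output unknowns.
print(add_unit.run({"A": 2, "B": 3}))  # {'Result': 5}
```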
Learning
111. A method according to any preceding claim, wherein the learned new information is represented in a machine representation conforming to the machine readable language.
112. A method according to any of the preceding claims, wherein new information is learned by automatically processing the machine representation to obtain or learn the new information, and the new information is itself represented as a machine representation stored in memory.
113. A method according to any preceding claim, wherein new information is learned from a machine learning system that generates a classification, prediction or other output represented as a machine representation such as a paragraph.
114. The method according to any of the preceding claims, wherein a machine learning system processes the machine representations such as semantic nodes and paragraphs to obtain or learn new information.
115. The method according to any of the preceding claims, wherein new information is generated by automatically processing the machine representations such as semantic nodes and paragraphs to answer questions.
116. The method of claim 115, wherein the representation of the problem, the machine representation previously stored in the memory storage area, the computing unit, and the inference paragraph are all represented in substantially the same machine-readable language.
117. The method according to any of the preceding claims, wherein new information is represented as a machine representation such as a semantic node or paragraph and is stored and used to improve learning new facts.
118. A method according to any of the preceding claims, wherein the new information is represented as a machine representation such as a semantic node or paragraph and is stored and used to improve the reasoning step.
119. The method according to any of the preceding claims, wherein new information is represented as a machine representation such as a semantic node or paragraph and stored and used to interpret or describe the new information in natural language.
120. The method according to any of the preceding claims, wherein the new information is represented as a machine representation such as a semantic node or paragraph and is stored and used for a text conversation or a spoken conversation with a human user.
121. The method according to any of the preceding claims, wherein new information is learned from a dialogue with, or other natural language provided by, a human user, wherein the natural language provided by the user in spoken or written form is translated into machine representations such as semantic nodes and paragraphs, and the new information represented by these machine representations is then stored and used.
122. A method according to any one of the preceding claims, wherein learning is performed from reasoning, wherein machine representations generated from a series of reasoning steps are combined with the inference paragraphs and stored and used.
123. A method according to any one of the preceding claims, wherein learning is performed from natural language by translating all or part of a natural-language document source, such as a web page, scientific paper or other article, into machine representations, and the resulting semantic nodes or paragraphs are subsequently used by an application.
124. The method according to any of the preceding claims, wherein a non-document source of natural language, such as an audio recording or video of human speech, is used, wherein a text transcription of the speech recording is first created using speech recognition techniques and then translated into machine representations.
125. The method of any of the preceding claims, wherein a machine learning system is used to analyze document and non-document data and create a machine representation from the data.
Semantic node resolution
126. A method according to any preceding claim, wherein a service is provided, the service being operable to receive a description of an entity and return one or more identifiers of a machine representation corresponding to the entity such that a user can use a shared identifier for the entity.
127. The method of claim 126, wherein the description is partially or wholly described in the machine-readable language.
128. The method of any of the preceding claims 126-127, wherein the description is written in part or in whole in one or more natural languages.
129. The method of any of the preceding claims 126-128, wherein the service compares the description of proposed machine representations with available information about existing entities to determine if there is a match.
130. The method of any of the preceding claims 126-129, wherein the service probabilistically determines whether there is a match.
131. The method of any of the preceding claims 126-130, wherein the service additionally returns a match probability and the one or more identifiers.
132. The method of any of the preceding claims 126-131, wherein the service returns a new identifier if no match is found.
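The resolution service of claims 126 to 132 might look, in a toy form, like the sketch below: it compares a textual description against known entities, returns candidate identifiers with match probabilities, and mints a new identifier when nothing matches. The matching heuristic, the registry contents and all names are invented for illustration.

```python
import uuid
from difflib import SequenceMatcher

# Toy registry of existing entity identifiers and their descriptions.
KNOWN_ENTITIES = {
    "node:paris_france": "Paris, the capital city of France",
    "node:paris_texas": "Paris, a city in Texas, United States",
}

def resolve(description, threshold=0.45):
    """Return (identifier, probability) pairs for plausible matches, or a
    freshly minted identifier if no existing entity matches."""
    scored = []
    for identifier, known_description in KNOWN_ENTITIES.items():
        score = SequenceMatcher(None, description.lower(),
                                known_description.lower()).ratio()
        if score >= threshold:
            scored.append((identifier, round(score, 2)))
    if scored:
        return sorted(scored, key=lambda pair: pair[1], reverse=True)
    new_identifier = f"node:{uuid.uuid4().hex[:8]}"
    return [(new_identifier, 1.0)]  # new identifier for a new entity

print(resolve("the capital city of France, Paris"))
print(resolve("an entirely unknown gadget"))
```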
Principles
133. The method of any of the preceding claims, wherein the machine representation comprises one or more principles, statements, or other rules defining the goal or motivation that are also represented using the machine representation; analyzing potential actions to determine if performing the actions would optimize or otherwise affect achievement or implementation of those principles, statements, or other rules; and an action is selected, decided or performed only if it optimizes or otherwise positively affects the achievement or implementation of those principles, statements or other rules.
134. The method of claim 133, wherein actions conforming to the principle, statement, or other rule ("principle") are automatically proposed by reference to the principle.
135. The method of any of preceding claims 133-134, wherein the action includes communicating with a user in written form.
136. The method of any of the preceding claims 133-135 in which the action includes communicating with the user in verbal form.
137. The method of any of the preceding claims 133-136 in which the principle includes at least one metric, such as user happiness, that the system should attempt to maximize.
138. The method of any of the preceding claims 133-137, wherein the principle includes at least one metric, such as user unhappiness, that the system should attempt to minimize.
139. The method of any of the preceding claims 133-138, wherein the principles comprise at least one rule specifying an action that must not be taken.
140. The method of any of the preceding claims 133-139, wherein avoiding that action is achieved by referencing the principles.
141. The method of any of the preceding claims 133-140 wherein the principles include: at least one suggestion of what action to do under defined conditions.
142. The method of any of the preceding claims 133-141, wherein the principles include sub-principles, which are principles related to other principles or more specific examples of another principle.
143. The method of any of preceding claims 133-142 in which the action includes accessing other remote computer systems.
144. The method of any of the preceding claims 133-143 in which the action includes changing the state of a device linked to the system via a network.
145. The method of any of the preceding claims 133-144 in which the action includes initiating a verbal interaction with a human.
146. The method of any of the preceding claims 133-145, wherein a data store contains a machine representation of the world encoding meaning, and inferences are made with reference to the machine representation of the world in order to select actions that are in accordance with the principles.
147. A method according to claim 146, wherein the machine representation of the world includes a representation of valid inference steps, and the representation of the valid inference steps is used for inference.
148. The method of any of the preceding claims 146-147, wherein the machine representation of the world includes a representation of computing power and the computing power is utilized by referencing the machine representation.
149. The method of any of the preceding claims 146-148, wherein the machine representation of the world is learned and augmented.
150. The method of any of the preceding claims 133-149 in which learning is achieved using communication with at least one user.
151. The method of any of the preceding claims 133-150 in which at least one external sensor connected to the system via a network is used for learning.
152. The method of any of the preceding claims 133-151 in which the machine-readable principles are represented at least in part by a combination of identifiers, and in which at least some of the identifiers represent concepts corresponding to real world things.
153. The method of any of the preceding claims 133-152 in which a description of a concept from a remote system is received and used to return an identifier that may mean the concept.
154. The method of any of preceding claims 133-153 in which continuous reasoning occurs in a manner that results in actions consistent with the principles.
155. The method of any of the preceding claims 133-154 in which questions from a human user regarding the principle are answered.
156. The method of any preceding claim, the method being implemented in a computer system comprising: a long-term memory; a short term memory; a rules store containing machine readable rules representing rules governing the system, and wherein the computer system is operable to receive events and utilize the events, the contents of the long term memory, the contents of the short term memory, and the rules to perform actions consistent with the rules.
157. The method of claim 156, wherein the computer system comprises: means for generating a candidate action; means for determining, by reference to the principles, whether to perform the candidate action; and means for performing the action.
158. The method of any of the preceding claims 156-157, wherein answering questions asked by a human user comprises two actions: a response to the question is generated and the response is transmitted to the human user.
159. The method of any of the preceding claims 156-158, wherein the event comprises a transmission from at least one user, and wherein the action comprises a transmission to at least one user.
160. The method of any of the preceding claims 156-159 wherein the system is further operable to learn and store content that the system has learned to the long term memory.
161. The method of any of the preceding claims 156-160, wherein the computer system is not operable to change the principle.
162. The method of any of the preceding claims 133-161, wherein the principles include a principle inhibiting any action that may result in the principles being altered.
163. The method of any of the preceding claims 133-162, wherein a separate check of each potential action is performed against the principles, and a potential action is discarded if the check finds that it is incompatible with any of the principles.
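A minimal sketch of the separate per-action check described in claim 163: each candidate action is tested against every principle and discarded on any incompatibility. The Principle representation and the compatibility tests below are placeholders; in the claimed system both the principles and the actions would themselves be machine representations.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Principle:
    name: str
    permits: Callable[[Dict[str, object]], bool]  # True if the action is compatible

def allowed_actions(candidates: List[Dict[str, object]],
                    principles: List[Principle]) -> List[Dict[str, object]]:
    """Keep only candidate actions compatible with every principle."""
    kept = []
    for action in candidates:
        if all(principle.permits(action) for principle in principles):
            kept.append(action)
    return kept

principles = [
    Principle("never alter the principles",
              lambda a: a.get("type") != "edit_principles"),
    Principle("do not reduce user happiness",
              lambda a: a.get("expected_happiness_delta", 0) >= 0),
]

candidates = [
    {"type": "speak", "text": "Good morning!", "expected_happiness_delta": 1},
    {"type": "edit_principles"},
    {"type": "speak", "text": "Bad news...", "expected_happiness_delta": -1},
]
print(allowed_actions(candidates, principles))  # only the first candidate survives
```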
164. The method of any of the preceding claims 156-163, wherein the computer system is further operable to actively exclude knowledge about itself when determining an action.
165. The method of any of the preceding claims 156-164, wherein potential actions are autonomously generated by the computer-based system.
166. The method of any of the preceding claims 156-165, wherein potential actions are autonomously generated by the computer-based system as output from processing input such as audio or text.
167. The method of any of the preceding claims 156-166, wherein the potential action is generated autonomously using a substantially continuously operating process.
168. The method of any of the preceding claims 156-167, wherein the potential action is autonomously generated without any external triggering event for initiating processing or user instruction or action for initiating processing.
169. The method of any of the preceding claims 156-168, wherein the potential action is automatically performed if the potential action optimizes or otherwise positively affects the achievement or implementation of the principle, statement, or other rule.
Privacy-preserving mode
170. The method of any of the preceding claims, wherein a first wake word initiates processing and the system then enters a privacy-preserving state requiring a second wake word, and wherein the second wake word is long enough or unusual enough that it is much less likely to be misrecognized than the first wake word.
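An illustrative state machine for the two-wake-word behaviour of claim 170: the short everyday wake word is only honoured outside the privacy-preserving state, which can only be left via a longer, more distinctive wake word. The wake words and class names are invented examples, not phrases from the claims.

```python
class WakeWordListener:
    """Toy two-state listener: NORMAL honours the short wake word,
    PRIVACY only reacts to the longer, harder-to-misrecognize phrase."""

    FIRST_WAKE_WORD = "hey assistant"                            # illustrative
    SECOND_WAKE_WORD = "assistant please resume listening now"   # illustrative

    def __init__(self):
        self.state = "NORMAL"

    def enter_privacy_mode(self):
        self.state = "PRIVACY"

    def hears(self, utterance: str) -> bool:
        """Return True if processing should start for this utterance."""
        phrase = utterance.lower().strip()
        if self.state == "NORMAL":
            return phrase.startswith(self.FIRST_WAKE_WORD)
        # In the privacy-preserving state only the long wake word works,
        # so accidental triggering is far less likely.
        if phrase.startswith(self.SECOND_WAKE_WORD):
            self.state = "NORMAL"
            return True
        return False

listener = WakeWordListener()
print(listener.hears("hey assistant, what's the weather?"))      # True
listener.enter_privacy_mode()
print(listener.hears("hey assistant, what's the weather?"))      # False
print(listener.hears("assistant please resume listening now"))   # True
```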
Multiple different voice assistants
171. The method of any of the preceding claims, wherein the experience of a plurality of different voice assistants is delivered to a plurality of users and at least one data store contains personality information that determines personalities of at least some of the plurality of different voice assistants.
172. The method of claim 171, wherein the personality information includes information regarding any of: the gender or name of the voice assistant, its voice, mood, emotional responses or formality, its position on an introvert-extrovert scale, its position on any of the Myers-Briggs scales or its classification under the Myers-Briggs or another personality test, or its visual appearance.
173. The method of any of the preceding claims 171-172, wherein there is at least one set of machine-readable principles representing the purposes and rules that direct at least some of the plurality of voice assistants, which then act in accordance with the principles by referencing them.
174. The method of any of the preceding claims 171-173, wherein the at least one set of machine-readable principles is a plurality of sets of machine-readable principles, and wherein a selected one of the plurality of different voice assistants is mapped to a selected one of the plurality of sets of machine-readable principles, such that different voice assistants are driven by different principles.
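A sketch of how personality information and principle sets could be associated with different assistants so that each assistant is driven by its own principles, as in claims 171 to 174. The field names, scales, example personalities and principle texts are illustrative assumptions only.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Personality:
    name: str
    gender: str
    voice: str
    formality: float          # 0.0 (casual) .. 1.0 (formal), illustrative scale
    extroversion: float       # 0.0 (introvert) .. 1.0 (extrovert)
    myers_briggs: str         # e.g. "ENFP"

@dataclass
class VoiceAssistant:
    personality: Personality
    principle_set: str        # key into a store of machine-readable principles

PRINCIPLE_SETS: Dict[str, List[str]] = {
    "companion": ["maximise user happiness", "never share user data"],
    "work":      ["maximise task completion", "never share user data"],
}

ASSISTANTS = {
    "aria": VoiceAssistant(
        Personality("Aria", "female", "warm-alto", 0.3, 0.8, "ENFP"), "companion"),
    "rex": VoiceAssistant(
        Personality("Rex", "male", "crisp-tenor", 0.9, 0.4, "ISTJ"), "work"),
}

def principles_for(assistant_id: str) -> List[str]:
    """Different assistants are driven by different principle sets."""
    return PRINCIPLE_SETS[ASSISTANTS[assistant_id].principle_set]

print(principles_for("aria"))
```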
175. A computer-based system configured to analyze data, the system configured to:
(a) store in memory a structured machine-readable representation of data conforming to a machine-readable language, the structured machine-readable representation of data comprising a representation of user speech or text input for a human-machine interface; and
(b) automatically process the structured representation to analyze the user speech or text input for the human-machine interface.
176. A computer-based system configured to implement the method of any of the preceding claims 1-174.
177. The computer-based system of claims 175-176 configured as a voice assistant.
178. The computer-based system of claim 177, being a voice assistant device configured to control items in a home, automobile, or other environment using the user speech or text input.
179. The computer-based system of claim 177 or 178, being a voice assistant device configured to run at least in part on a smartphone, notebook computer, smart speaker, or other electronic device.
180. The computer-based system of claim 177 or 178, being a voice assistant device configured to run at least partially on a cloud or central server and at least partially on an edge device.
181. A computer system comprising a processor and a memory, the processor being configured to answer a question, the processor being configured to use a processing language, wherein semantic nodes are represented in the processing language, the semantic nodes comprising semantic links between semantic nodes, wherein the semantic links are themselves semantic nodes, wherein each semantic node marks a specific meaning, wherein a combination of semantic nodes defines a semantic node, wherein expressions in the processing language can be nested, wherein the question is represented in the processing language, wherein an inference step is represented in the processing language to represent semantics of the inference step, wherein a computing unit is represented in the processing language, wherein the memory is configured to store the representation in the processing language, and wherein the processor is configured to answer the question using the inference step, the computing unit, and the semantic nodes, and store an answer to the question in the memory.
182. The computer system of claim 181, wherein the computer system is configured to output the answer to the question.
183. The computer system of claim 181, wherein the computer system is configured to output the answer to the question to a display device.
184. The computer system of any one of claims 181 to 183, wherein expressions in the processing language are nestable without limitation inherent to the processing language.
185. The computer system of any one of claims 181 to 184, wherein the semantic nodes each comprise a unique identifier.
186. The computer system of any one of claims 181 to 185, wherein the computing unit is a semantic node.
187. The computer system of any one of claims 181 to 186, wherein the question is represented in the processing language by: a paragraph comprising semantic nodes that identify the paragraph as a question; a list of semantic nodes representing the zero, one or more unknown entities being asked about; and at least one further paragraph representing the semantics of the question in the context of the zero, one or more unknown entities.
188. The computer system of any one of claims 181 to 187, wherein the processing language is a general purpose language.
189. The computer system of any one of claims 181 to 188, wherein the processing language is not a natural language.
190. The computer system of any one of claims 181 to 189, wherein the question relates to searching and analysis of documents or web pages, wherein the semantic node comprises a representation of at least a portion of the documents or web pages stored in a document store.
191. The computer system of any one of claims 181 to 190, wherein the question relates to a location-based search using map data represented in the processing language as semantic nodes.
192. The computer system of any one of claims 181 to 191, wherein the question relates to a search for defined advertisements or news, wherein the semantic node comprises a representation of an advertisement, a news article, or other information item.
193. The computer system of any one of claims 181 to 192, wherein the question relates to a request for a summary of a news topic, wherein the semantic node includes representations of news from multiple sources, so as to provide a summary or aggregation of the news.
194. The computer system of any one of claims 181 to 193, wherein the question relates to a request for compatibility matches between persons, wherein, for a plurality of persons, the semantic node comprises a representation of personal information defining one or more attributes of the person.
195. The computer system of any one of claims 181 to 194, wherein the question relates to compliance with a requirement to prevent abuse or illegal social media posts, wherein the semantic node comprises a representation of a social media post.
196. The computer system of any one of claims 181 to 195, wherein the question involves analyzing customer reviews, wherein the semantic node comprises a representation of a customer review.
197. The computer system of any one of claims 181 to 196 wherein the question relates to a user's product request, wherein the semantic node comprises a representation of a product description and a user's product request.
198. The computer system of any one of claims 181 to 197, wherein the question relates to a job search, wherein the semantic node includes representations of job descriptions and skills and experiences of job seekers to determine which job seekers match job descriptions or to determine which job descriptions match skills and experiences of job seekers.
199. The computer system of any one of claims 181 to 198, wherein the question relates to the health of an individual, wherein the semantic node comprises health data related to the individual and health data related to humans generally.
200. The computer system of any one of claims 181 to 199, wherein the question relates to nutrition, wherein the semantic nodes include nutritional data for foods and beverages.
201. The computer system of any one of claims 181 to 200, wherein the question relates to accounting or finance, wherein the semantic node includes a representation of financial or accounting information.
202. The computer system of any one of claims 181 to 201, wherein the question is received by a voice assistant or chatbot, wherein the semantic node comprises a representation of user verbal input for a human-machine interface and comprises a representation of the human-machine interface itself.
203. A computer-implemented method using a computer system comprising a processor and a memory, the processor configured to use a processing language, wherein semantic nodes are represented in the processing language, the semantic nodes comprising semantic links between semantic nodes, wherein the semantic links are themselves semantic nodes, wherein each semantic node marks a specific meaning, wherein a combination of semantic nodes defines a semantic node, wherein expressions in the processing language can be nested, wherein a question is represented in the processing language, wherein an inference step is represented in the processing language to represent the semantics of the inference step, wherein a computing unit is represented in the processing language, and wherein the memory is configured to store the representations in the processing language, the method comprising the steps of:
(i) the processor answering the question using the inference step, the computing unit and the semantic nodes; and
(ii) the processor storing an answer to the question in the memory.
204. The method of claim 203, wherein the question is represented in the processing language by: a paragraph comprising semantic nodes that identify the paragraph as a question; a list of semantic nodes representing the zero, one or more unknown entities being asked about; and at least one further paragraph representing the semantics of the question in the context of the zero, one or more unknown entities.
205. The method of claim 204, wherein the unknowns in the question are identified and the paragraphs constituting the subject of the question are selected for further analysis; processing begins with the paragraph list of the subject and the unknowns selected from the question; a first paragraph in the paragraph list is selected for processing; and processing a single paragraph uses three methods: using statically stored processing-language paragraphs, using the computing units, and using processing language generated from reasoning:
wherein the first method is to determine whether the paragraph store contains any paragraphs that can be mapped directly onto the paragraph being processed; if a stored paragraph has exactly the same structure as the paragraph being processed and all nodes other than the unknowns match, then the values to which the unknowns are matched are a valid result;
the second method is to check whether any result can be found by executing a computing unit; the paragraph is checked against the paragraphs in the computing unit descriptions; all non-unknown nodes in the paragraph being processed must match the node at the corresponding location in the computing unit description or align with an input unknown of the computation; the unknown being processed must align with an output unknown in the description; the computing unit is then invoked to obtain valid output values for the unknown of the paragraph being processed;
the third method is to determine whether the paragraph can be proved by applying any of the inference steps; the inference paragraphs are searched to find those whose second half can be unified with the paragraph being processed; all nodes and structure must be equal between the two paragraphs, except for the unknowns in the paragraph being processed or in the inference paragraph; if such an inference paragraph is found, it means that the inference step may be able to prove the paragraph being processed; upon matching the inference paragraph, a multi-stage process is used: first, any mapping of the unknowns in the paragraph being processed is found; second, a mapping of the unknowns used in the inference paragraph is found via the mapping with the paragraph being processed; this mapping can then be applied to the first half of the inference paragraph to generate a paragraph list which, if matched against known or generated processing language together with the mappings found for that paragraph list, proves the paragraph being processed and yields a valid mapping for it; the solution to the paragraph list can then be found recursively.
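The three methods of claim 205 can be pictured with the toy recursive solver below: method 1 maps the paragraph being processed directly onto stored paragraphs, method 2 tries a computing unit whose description the paragraph matches, and method 3 applies inference paragraphs whose conclusion unifies with the paragraph being processed, recursively solving the resulting paragraph list. It is a deliberately simplified sketch under invented data structures and names; the depth limit is crude, and parallelism, trust values and list-wide contradiction handling are omitted.

```python
import itertools

# Paragraphs are tuples of node identifiers; unknowns start with "?".
PARAGRAPH_STORE = [
    ("LocatedIn", "eiffel_tower", "paris"),
    ("LocatedIn", "paris", "france"),
]

# One inference paragraph: if ?a is in ?b and ?b is in ?c, then ?a is in ?c.
INFERENCE_PARAGRAPHS = [
    {"premises": [("LocatedIn", "?a", "?b"), ("LocatedIn", "?b", "?c")],
     "conclusion": ("LocatedIn", "?a", "?c")},
]

def is_unknown(node):
    return isinstance(node, str) and node.startswith("?")

def walk(node, subst):
    while is_unknown(node) and node in subst:
        node = subst[node]
    return node

def unify(a, b, subst):
    """Unify two paragraphs: unknowns may bind, other nodes must be equal."""
    if len(a) != len(b):
        return None
    subst = dict(subst)
    for x, y in zip(a, b):
        x, y = walk(x, subst), walk(y, subst)
        if x == y:
            continue
        if is_unknown(x):
            subst[x] = y
        elif is_unknown(y):
            subst[y] = x
        else:
            return None
    return subst

def sum_unit(paragraph):
    """Toy computing unit for ("Sum", a, b, ?result): binds ?result to a + b."""
    predicate, a, b, result = paragraph
    if predicate == "Sum" and not is_unknown(a) and not is_unknown(b) and is_unknown(result):
        return {result: a + b}
    return None

_fresh = itertools.count()

def rename(rule):
    """Give an inference paragraph fresh unknown names before each use."""
    suffix = f"_{next(_fresh)}"
    ren = lambda p: tuple(n + suffix if is_unknown(n) else n for n in p)
    return {"premises": [ren(p) for p in rule["premises"]],
            "conclusion": ren(rule["conclusion"])}

def solve(paragraph, subst, depth=0, max_depth=4):
    """Yield substitutions under which 'paragraph' can be shown to hold."""
    if depth > max_depth:
        return
    # Method 1: map directly onto statically stored paragraphs.
    for stored in PARAGRAPH_STORE:
        s = unify(paragraph, stored, subst)
        if s is not None:
            yield s
    # Method 2: try the computing unit if the shape matches its description.
    concrete = tuple(walk(n, subst) for n in paragraph)
    if len(concrete) == 4:
        out = sum_unit(concrete)
        if out:
            yield {**subst, **out}
    # Method 3: apply inference paragraphs whose conclusion unifies.
    for rule in INFERENCE_PARAGRAPHS:
        fresh = rename(rule)
        s = unify(paragraph, fresh["conclusion"], subst)
        if s is not None:
            yield from solve_list(fresh["premises"], s, depth + 1, max_depth)

def solve_list(paragraphs, subst, depth, max_depth):
    """Recursively solve a paragraph list, threading the mappings through."""
    if not paragraphs:
        yield subst
        return
    for s in solve(paragraphs[0], subst, depth, max_depth):
        yield from solve_list(paragraphs[1:], s, depth, max_depth)

question = ("LocatedIn", "eiffel_tower", "?where")
print({walk("?where", s) for s in solve(question, {})})  # {'paris', 'france'} (order may vary)
print(next(solve(("Sum", 2, 3, "?total"), {})))          # {'?total': 5}
```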
206. The method of any one of claims 203 to 205 using the computer system of any one of claims 181 to 202.
Applications Claiming Priority (5)
GB2013207.2, priority date 2020-08-24
GB2014876.3, priority date 2020-09-21
GBGB2020164.6A (GB202020164D0), filed 2020-12-18, title: Unlikely AI patent III
GB2020164.6, priority date 2020-12-18
PCT/GB2021/052196 (WO2022043675A2), priority date 2020-08-24, filed 2021-08-24, title: A computer implemented method for the automated analysis or use of data

Publications (1)
CN116457774A, published 2023-07-18

Family ID
74221457

Family Applications (1)
CN202180072454.8A, priority date 2020-08-24, filed 2021-08-24, title: Computer-implemented method for automatically analyzing or using data, status: Pending

Also Published As
GB202020164D0, published 2021-02-03

Legal Events
PB01: Publication
SE01: Entry into force of request for substantive examination