Towards Reasonable Agents

First thoughts on an upcoming IEML_GPT

Reminder: « I work from the perspective of an artificial intelligence dedicated to increasing collective intelligence. I designed IEML to serve as a semantic protocol, enabling the communication of meanings and knowledge (mental models) in digital memory, while optimizing machine learning and automatic reasoning. »

For more information on IEML: https://journals.sagepub.com/doi/10.1177/26339137231207634

About GPT Builder: https://help.openai.com/en/articles/8554397-creating-a-gpt

VISION

Let’s imagine a knowledge-sharing system that makes the most of today’s technical possibilities. At the heart of this system is an open ecosystem of knowledge bases categorized in IEML, emerging from a multitude of communities of research and practice. Between this core of interoperable knowledge bases and living human users lies a « no-code » neural interface (an ecosystem of models) through which the data can be controlled, fed, explored and analyzed. Everything happens intuitively and directly, according to the sensory-motor modalities selected. It is also via this giga-perceptron – an immersive, social and generative metaverse – that communities exchange and discuss the data models and semantic networks that organize their memories. In keeping with good knowledge management, the new knowledge-sharing system encourages the recording of creations, accompanies learning paths and presents useful information to practitioners engaged in their activities. The IEML_GPT model described here is a first step in this direction.

Now that AI has been unleashed on the Internet and coupled with social media, we need to tame and harness the monster. How do we make AI reasonable? How do we get it to « understand » what we’re saying to it, and what it’s saying to us, rather than just calculating word-occurrence probabilities from training data? We would have to teach it the meaning of words and phrases in such a way that it forms an abstract representation *understandable to itself* not only of the physical world (I’ll leave that task to Yann LeCun), but also of the human world and, more generally, of the world of ideas.

In other words, how can we graft symbolic encoding and decoding capabilities onto a neural model that initially can only recognize and generate sensory forms, or aggregates of signifiers? This challenge is reminiscent of the process of hominization – when biological neural networks became capable of manipulating symbolic systems – a parallel that does not displease me.

UNDERSTANDING / KNOWLEDGE / INTEROPERABILITY

To understand a sentence is to place it within the self-defining dynamics of a language, even before grasping the sentence’s extralinguistic reference. AI will understand what is being said to it when it is capable of automatically transforming a character string into a semantic network that feeds into the self-referential and self-defining loop of a language. A language’s dictionary, with its definitions, is a crucial part of this loop. Just as a deduction ultimately represents a logical tautology, a language’s dictionary exhibits a *semantic tautology*. This is why IEML_GPT must contain a file with the IEML-French-English dictionary (and perhaps other languages), including all the relations between words in the form of IEML phrases. The dictionary is a meta-ontology that is the same for all users. Other files may contain local models or ontologies corresponding to user communities’ ecosystems of practice.
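
To make this definitional loop concrete, here is a minimal sketch, assuming a hypothetical export format in which each dictionary word maps to the set of words used in its defining sentences; the entries are toy placeholders, not actual IEML. The check verifies that the dictionary closes on itself: every word used in a definition is itself defined.

```python
# Minimal sketch of the dictionary's self-referential loop, assuming a
# hypothetical export format: each word maps to the set of words used in
# its defining sentences. The dictionary is "closed" when every word that
# appears in a definition is itself defined.

def closure_gaps(dictionary: dict[str, set[str]]) -> set[str]:
    """Return the words used in definitions but not themselves defined."""
    defined = set(dictionary)
    used = set().union(*dictionary.values())
    return used - defined

# Toy example (hypothetical entries, not actual IEML):
toy = {
    "mountain": {"landform", "hill"},
    "hill":     {"landform"},
    "landform": {"mountain"},  # circular definitions are expected; gaps are not
}
print(closure_gaps(toy))  # -> set(): the loop closes on itself
```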

1) Linguistic understanding. Reasonable agents are able to recognize and generate syntactically valid IEML character sequences, in particular by means of a parser. They have an understanding of IEML: they reconstruct the recursively embedded syntagmatic trees, as well as the relationships between concepts that derive from the dictionary and from the paradigmatic matrices (or substitution groups) that organize the concepts. Each concept (represented by an IEML word or phrase) is thus at the center of a star of syntactic and semantic relationships (see the sketch after this list).

2) Practical domain knowledge. Reasonable agents are driven by knowledge bases that enable them to understand (locally) the world in which they have to operate. They have models (ontologies or knowledge graphs in IEML) of the practical situations facing their users. They are able to reason on the basis of these models. They are able to relate the data they acquire and the questions they are asked to these models.

3) Semantic interoperability. Reasonable agents share the same language (IEML) and therefore understand each other. They can exchange models or sub-models. They transform natural language expressions into IEML and IEML expressions into natural languages: they can therefore understand humans and make themselves understood by them.
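
To make points 1 and 2 concrete, here is a minimal sketch that reduces a model to subject-relation-object triples; the relation names and concepts are illustrative placeholders, not actual IEML. It shows a concept’s relation star, and a single forward-chaining rule as a stand-in for fuller automatic reasoning.

```python
# Minimal sketch: a model as subject-relation-object triples, the "star"
# of relations around a concept, and one forward-chaining rule.
# All identifiers below are illustrative placeholders, not actual IEML.

Triple = tuple[str, str, str]

model: set[Triple] = {
    ("sociology", "belongs_to", "humanities"),
    ("humanities", "belongs_to", "knowledge"),
    ("mountain", "bigger_than", "hill"),
}

def star(concept: str, triples: set[Triple]) -> set[Triple]:
    """All triples in which the concept plays a role (its relation star)."""
    return {t for t in triples if concept in (t[0], t[2])}

def closure(triples: set[Triple], relation: str) -> set[Triple]:
    """Forward-chain a single transitivity rule until a fixed point."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for (a, r1, b) in list(inferred):
            for (c, r2, d) in list(inferred):
                if r1 == r2 == relation and b == c and (a, relation, d) not in inferred:
                    inferred.add((a, relation, d))
                    changed = True
    return inferred

print(star("humanities", model))  # two triples: as object, then as subject
print(("sociology", "belongs_to", "knowledge") in closure(model, "belongs_to"))  # True
```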

TASK 1: THE DICTIONARY

1.0 The dictionary already contains about three thousand words organized into paradigms; I also have a formal grammar, a parser to validate sentences, and a library of functions to generate paradigms. Here is the IEML dictionary.

1.1 The first step is to create concept-phrases to express the *sets of words* (lexical families and semantic fields) represented by the paradigms, their columns, their rows and so on. Let’s call the concepts defining these sets of words « lexical concepts ». Words in the same lexical family share common syntactic features and often belong to the same root paradigms. These lexical concepts will have to be created systematically by means of paradigmatic functions.
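
Here is a minimal sketch of what a paradigmatic function could look like, assuming paradigms can be reduced to templates with substitution slots; the template syntax and word lists are hypothetical placeholders, not actual IEML.

```python
# Minimal sketch of a paradigmatic function: generate a family of
# concept-phrases by substituting variable slots in a fixed template.
# The template syntax and word lists are hypothetical, not actual IEML.
from itertools import product

def paradigm(template: str, **slots: list[str]) -> list[str]:
    """Substitute every combination of slot values into the template."""
    names = list(slots)
    return [template.format(**dict(zip(names, combo)))
            for combo in product(*(slots[n] for n in names))]

# One hypothetical paradigm: processes applied to kinds of signs
family = paradigm("({verb} of {noun})",
                  verb=["encoding", "decoding"],
                  noun=["word", "sentence", "text"])
print(family)  # 2 x 3 = 6 phrases; the set itself is named by a lexical concept
```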

I need to find ways of generating lexical-concept paradigms automatically from natural-language instructions with IEML_GPT, rather than with the current editor, which is not easy to use.

1.2 The second step is to create all the « analytical propositions » that define the words in the dictionary and make their relationships explicit by means of words and lexical concepts. For example: « A mountain is bigger than a hill »; « Sociology belongs to the humanities ». Analytical propositions of this kind are always true, and together they define a meta-ontology. So we’ll need to create the paradigms of the dictionary’s *relations*, and have them generated by IEML_GPT from natural-language instructions.
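
Here is a minimal sketch of how a relation paradigm could yield analytical propositions, keeping both a formal triple and its natural-language gloss; the relation names and gloss templates are illustrative assumptions, not actual IEML.

```python
# Minimal sketch: a relation paradigm as a template that yields analytical
# propositions, each stored both as a formal triple and as a natural-language
# gloss. Relation names and glosses are illustrative, not actual IEML.

RELATIONS = {
    "bigger_than": "{a} is bigger than {b}",
    "belongs_to":  "{a} belongs to {b}",
}

def propose(a: str, relation: str, b: str) -> dict:
    """Build one analytical proposition in both formal and glossed form."""
    return {"triple": (a, relation, b),
            "gloss": RELATIONS[relation].format(a=a, b=b)}

print(propose("mountain", "bigger_than", "hill")["gloss"])
print(propose("sociology", "belongs_to", "the humanities")["gloss"])
```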

1.3 All of the dictionary’s internal relationships, materialized by hyperlinks, are created by sentences. In terms of the user interface, this means creating internal hypertext links (between words and lexical concepts) in such a way that their grammatical relationships are as clear as possible. The dictionary-hypertext document must also be generated automatically by IEML_GPT. For each word, we’ll obtain a list (a « page »?) of true sentences containing the word, organized by grammatical role: the word in root role, the word in object role, and so on. Here is a concise version of the IEML grammar.
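
Here is a minimal sketch of how such a page could be generated, assuming each true sentence records the grammatical role of every word it contains; the sentence records are hypothetical, and only the role names (root, object) come from the grammar as described above.

```python
# Minimal sketch: generate a word's hypertext "page" by grouping the true
# sentences that contain it according to its grammatical role. Role names
# follow the text (root, object); the sentence records are hypothetical.
from collections import defaultdict

sentences = [
    {"text": "A mountain is bigger than a hill",
     "roles": {"mountain": "root", "hill": "object"}},
    {"text": "A hill is a small landform",
     "roles": {"hill": "root"}},
]

def page(word: str) -> str:
    """Render the word's page, one section per grammatical role."""
    by_role: dict[str, list[str]] = defaultdict(list)
    for s in sentences:
        if word in s["roles"]:
            by_role[s["roles"][word]].append(s["text"])
    lines = [f"# {word}"]
    for role, texts in by_role.items():
        lines.append(f"## as {role}")
        lines += [f"- {t}" for t in texts]
    return "\n".join(lines)

print(page("hill"))  # hill as object, then hill as root
```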

These sentences will be used not only to define words, but also to begin accumulating examples and even training data, with correspondences between formal IEML phrases and literary translations into French and English. In short, the first finished product will be a complete dictionary, with words, lexical concepts and inter-definition relations in hypertextual form, all in IEML, English and French.
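
As a sketch of what one accumulated example could look like, here is a hypothetical JSONL record pairing a formal IEML phrase with its French and English translations; the field names and the placeholder IEML string are assumptions, not a defined format.

```python
# One possible record format for the accumulated examples: a formal IEML
# phrase paired with its French and English translations, stored as JSONL.
# The "ieml" value is a placeholder, not a valid IEML expression.
import json

record = {
    "ieml": "<placeholder IEML phrase>",
    "fr": "Une montagne est plus grande qu'une colline.",
    "en": "A mountain is bigger than a hill.",
}
with open("dictionary_examples.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")
```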

TASK 2: AN ONTOLOGY EDITOR

Task 1 will have tested the best ways of creating paradigms from natural-language instructions, or even from templates, to ease the workload of ontology designers.

The output of the ontology editor could be in RDF, JSON-LD, or in the form of a hypertext document. It could also be an interactive multimedia document: tables, trees, networks of concepts that can be explored, verbal/sound illustrations, etc.
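
For the RDF/JSON-LD option, here is a minimal sketch using rdflib (whose JSON-LD serializer is built in from version 6 onward); the namespace and terms are hypothetical placeholders, not actual IEML identifiers.

```python
# Minimal sketch of the editor's RDF/JSON-LD output using rdflib
# (JSON-LD serialization is built into rdflib >= 6). The namespace and
# terms are hypothetical placeholders, not actual IEML identifiers.
from rdflib import Graph, Namespace, Literal, RDF, RDFS

IEML = Namespace("https://example.org/ieml/")  # hypothetical namespace
g = Graph()
g.bind("ieml", IEML)
g.add((IEML.mountain, RDF.type, RDFS.Class))
g.add((IEML.mountain, RDFS.subClassOf, IEML.landform))
g.add((IEML.mountain, RDFS.label, Literal("mountain", lang="en")))

print(g.serialize(format="json-ld"))
```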

Ideally, each ontology we create should come with a native inference engine, thus supporting automatic reasoning. The intellectual property of ontology creators must be recognized.
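
As a sketch of this kind of native inference, rdflib’s built-in transitive traversal can stand in for a fuller reasoner (an RDFS/OWL engine such as owlrl could replace it); the namespace and terms are again hypothetical placeholders.

```python
# Minimal sketch of native inference over an ontology fragment: rdflib's
# transitive traversal stands in for a fuller reasoner (an RDFS/OWL engine
# such as owlrl could replace it). Terms are hypothetical placeholders.
from rdflib import Graph, Namespace, RDFS

IEML = Namespace("https://example.org/ieml/")  # hypothetical namespace
g = Graph()
g.add((IEML.mountain, RDFS.subClassOf, IEML.landform))
g.add((IEML.landform, RDFS.subClassOf, IEML.thing))

# All superclasses of "mountain", including the inferred one:
for ancestor in g.transitive_objects(IEML.mountain, RDFS.subClassOf):
    print(ancestor)  # mountain, landform, thing
```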

IEML_GPT will be able to run any IEML ontology or set of ontologies.

TASK 3: AUTOMATIC CATEGORIZATION

The next step is to build an integrated tool for automatic categorization of data in IEML. The AI is given a dataset and an IEML ontology (ideally in the form of a reference file), and the result is a set of data categorized according to the terms of the ontology. The completion of Task 3 paves the way for the creation of a knowledge base ecosystem as described in the vision above.
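
One plausible mechanism for this categorization step, sketched below under stated assumptions, is embedding similarity: embed both the data items and the ontology terms’ labels, then assign each item to its nearest term. The ontology labels and data items are toy placeholders; sentence-transformers and the model name are one possible choice, not a prescribed one.

```python
# Minimal sketch of automatic categorization by embedding similarity,
# one plausible mechanism among others. Ontology labels and data are toy
# placeholders; the sentence-transformers model is one possible choice.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

ontology_terms = ["mountain", "hill", "river"]  # hypothetical labels
data = ["Everest rises 8,849 m above sea level",
        "The Danube crosses ten countries"]

term_emb = model.encode(ontology_terms, convert_to_tensor=True)
data_emb = model.encode(data, convert_to_tensor=True)

scores = util.cos_sim(data_emb, term_emb)  # shape: (items x terms)
for item, row in zip(data, scores):
    print(item, "->", ontology_terms[int(row.argmax())])
```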

All these steps will first have to be carried out on a small scale (proofs of concept, agile method) before being fully implemented.

Published by Pierre Lévy

Associate Researcher at the University of Montreal (Canada), Fellow of the Royal Society of Canada. Author of Collective Intelligence (1994), Becoming Virtual (1995), Cyberculture (1997), The Semantic Sphere (2011) and several other books translated into numerous languages. CEO of INTLEKT Metadata Inc.
