Theoretical Principles


Abstract

IEML (Information Economy MetaLanguage) is a language with computable semantics that can be considered from three complementary points of view: linguistics, mathematics and computer science. From the linguistic point of view, it is a philological language, i.e. it can translate any natural language and is able to define its own words. Mathematically, it is a category, i.e. an algebraic structure in isomorphic relation with a graph structure (a network of semantic relations between concepts). Finally, on the computing side, it functions as the indexing system of a virtual database and as a declarative programming language for semantic networks, ontologies and dataset labels. IEML solves a series of problems in the fields of artificial intelligence (labels for machine learning and ontologies for knowledge graphs) and knowledge management.

The Problem of Coding Linguistic Meaning

There is currently no way in computer science to encode linguistic meaning in a uniform and computable way, as we can, for example, encode images by means of pixels or vectors. To represent meaning, we still use natural languages, which are notoriously multiple, changing, and ambiguous. With the notable exception of number notation and mathematical codes, our writing systems are primarily designed to represent sounds. Their representation of categories or concepts is indirect (characters → sound → concepts) and difficult for computers to grasp. Computers can handle syntax (the regular arrangement of characters), but their handling of semantics remains imperfect and laborious. Despite the success of machine translation (DeepL, Google Translate) and automatic text generation (GPT-3, ChatGPT, etc.), computers don’t really understand the meaning of the texts they read or write.

Many advances in computer science come from the discovery of a coding system that makes the coded object (number, image, sound, etc.) easily computable. Now we want to make concepts or categories – linguistic meaning – systematically computable.

How do languages work? Let’s put aside for the moment the obstacles of ambiguity and misunderstanding and sketch the main process. On the receiving end, we hear a sequence of sounds that we translate into a network of concepts, thus giving meaning to a linguistic expression. On the sending end, from a network of concepts we have in mind – a meaning to be transmitted – we generate a sequence of sounds. Language functions as an interface between sequences of sounds and networks of concepts. The sound chains can be replaced by sequences of ideograms, letters, or gestures, as in sign languages. The quasi-automatic interfacing between a sequence of sensory images (auditory, visual, tactile) and a graph of abstract concepts (general categories) remains constant among all languages and writing systems.

This reciprocal translation between a sequence of images (the signifier) and networks of concepts (the signified) is the very principle of the IEML mathematical model, namely a mathematical category organizing a correspondence between an algebra and a graph structure. The algebra regulates the reading and writing operations on the sequential texts, while the graph structure organizes the operations on the nodes and the directed links of the concept networks. Each text corresponds to a concept network, and the operations on the texts dynamically reflect the operations on the concept graphs.
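The idea that operations on texts are mirrored by operations on concept graphs can be illustrated with a deliberately small toy model. Everything in it is an assumption made for the example: a "text" is reduced to a list of (source, relation, target) statements, the relation names are invented, and the functions to_graph, concat and merge are hypothetical. It shows only a structure-preserving map from texts to graphs, not IEML's actual algebra and not a full isomorphism.

```python
# Toy model (not IEML): a text is a sequence of statements,
# a concept network is a set of directed, labelled edges.

def to_graph(text):
    """Map a sequential text to its concept network (a set of edges)."""
    return frozenset(text)

def concat(text_a, text_b):
    """Operation on the text side: write one text after another."""
    return text_a + text_b

def merge(graph_a, graph_b):
    """Operation on the graph side: take the union of two networks."""
    return graph_a | graph_b

# Hypothetical statements; the relation names are invented for illustration.
t1 = [("I", "subject_of", "choose")]
t2 = [("menu", "object_of", "choose"), ("vegetarian", "qualifies", "menu")]

# Concatenating texts and then translating gives the same network as
# translating each text and then merging the networks.
combined_then_mapped = to_graph(concat(t1, t2))
mapped_then_combined = merge(to_graph(t1), to_graph(t2))
```

In this sketch the two results are equal, which is the sense in which an operation on texts "reflects" an operation on graphs. The toy map forgets statement order and repetitions, so it is weaker than the correspondence the model describes.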

To encode chains of signifiers mathematically, I used a formal language. Such a formalism makes it possible to automatically transform sequences of symbols into syntagmatic trees – reflecting the dependency structure between words in a sentence – and vice versa. However, while the syntagmatic tree is indispensable for understanding the meaning of a sentence, it is not sufficient. Indeed, each linguistic expression lies at the intersection of a syntagmatic axis and a paradigmatic axis. The syntagmatic tree represents the internal semantic network of a sentence; the paradigmatic axis represents its external semantic network, in particular its relations with sentences that have the same structure but differ from it by a few words. To understand the sentence “I choose the vegetarian menu”, one must of course recognize that the verb is “to choose”, the subject “I” and the object “the vegetarian menu”, and know moreover that “vegetarian” qualifies “menu”. But one must also know the meaning of the words and know, for example, that vegetarian differs from meaty, flexitarian and vegan, which implies going beyond the sentence to situate its components in systems of taxonomy and semantic oppositions, those of language as well as those of various practical fields. The establishment of semantic relations between concepts thus implies the recognition of syntagmatic trees internal to sentences, but also of paradigmatic matrices external to the sentence. This is why IEML algebraically encodes not only the syntagmatic trees, but also the paradigmatic matrices where words and concepts take their meaning.
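The two axes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not IEML's grammar: the Sentence class, its four role names, the DIET_PARADIGM tuple and the paradigm_variants function are all hypothetical names introduced for this example. The syntagmatic axis is the fixed set of roles inside one sentence; the paradigmatic axis is the set of sentences obtained by substituting one role from a list of opposed terms.

```python
from dataclasses import dataclass, replace

# Hypothetical, simplified roles; IEML's own syntagmatic roles are not shown here.
@dataclass(frozen=True)
class Sentence:
    verb: str
    subject: str
    obj: str
    obj_qualifier: str

# An assumed paradigm of opposed terms, taken from the example in the text.
DIET_PARADIGM = ("vegetarian", "meaty", "flexitarian", "vegan")

def paradigm_variants(sentence, role, paradigm):
    """Return the sentences that share the same tree but differ at `role`."""
    current = getattr(sentence, role)
    return [replace(sentence, **{role: word})
            for word in paradigm if word != current]

# Syntagmatic axis: the internal structure of one sentence.
s = Sentence(verb="choose", subject="I", obj="menu", obj_qualifier="vegetarian")

# Paradigmatic axis: the neighbouring sentences that give "vegetarian" its value.
variants = paradigm_variants(s, "obj_qualifier", DIET_PARADIGM)
for v in variants:
    print(v.subject, v.verb, "the", v.obj_qualifier, v.obj)
```

Each variant keeps the verb, subject and object of the original sentence and changes only the qualifier, which is what it means for sentences to share a structure while differing by a word.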

In short, each sentence in IEML is located at the intersection of a syntagmatic tree and paradigmatic matrices. In addition to a regular grammar, IEML relies on a dictionary of about 3000 words – without synonyms or homonyms – organized in a little more than a hundred paradigms. These words have been chosen to allow the recursive construction of any concept or paradigm of concepts by means of sentences. It should be noted that, in contrast to natural languages which are ambiguous and irregular, the expressions in IEML are univocal. On the basis of the fractal – but regular – syntagmatic-paradigmatic grid of IEML, it then becomes possible to generate and recognize semantic relations between concepts in a functional – and therefore programmable – way.

The complexity of semantics

Let us now turn to the question of semantics in general, which cannot be confined to the linguistic field and obviously depends on the practical context and the modes of veridiction at stake. When a sentence is pronounced, it makes sense on at least three levels.

Conceptualization: the mental representation prompted by its grammatical structure and the meaning of its words (an utterance evokes a network of concepts).
Veridiction: the logical plane of its reference to a state of things (an utterance is true or false).
Interaction: the practical plane of social interaction (an utterance is a move in a language game).

These three meanings – linguistic for the conceptualization, referential for the veridiction and pragmatic for the interaction – echo the Trivium of Roman antiquity and the Western Middle Ages: grammar, dialectics and rhetoric.

Pragmatic semantics, or interaction

Language functions primarily as a tool for exercising, regulating and representing social interactions. In fact, even personal thought – which does not go beyond the limits of our innermost being – takes the form of a conversation. According to Lev Vygotsky, thought is the result of an internalization of dialogue. When we ask ourselves about the meaning of a sentence, we must therefore first note its pragmatic and dialogical dimensions: the type of language game played by the interlocutors, the circumstances of its enunciation, its potential effects. In sum, the enunciation of a sentence is an act. And the pragmatic meaning of the sentence is the effect of this act as it is recorded in the collective memory of the interlocutors. Pragmatic semantics is tied up in the relationship between a speech act – more generally a symbolic act – and a social situation. It is mainly related to game theory, law, sociology, and even systems theory.

Referential semantics, or truthfulness

At the level of pragmatic semantics, a sentence is more or less relevant or effective, felicitous or infelicitous, according to its effects in context. But for an utterance to be capable of such a pragmatic meaning, it must also be capable of describing a reality, be it exact or inexact, serious or fictitious. Like states of consciousness, propositions are intentional, that is, they point to a referent. Pragmatic semantics records the way in which an utterance modifies a social situation, whereas referential semantics focuses on the relation between an utterance and a state of things. A proposition is true or false only at the level of referential semantics. It is also at this level that its truth value is transmitted – or not – from one proposition to another in a logical reasoning. Let us note that if language were not able to carry referential semantics – that is to say, to represent reality and to tell the truth – it could not play its pragmatic role. It would be impossible, for example, to state the rules of a game or to evoke social interactions. Referential semantics is more a matter of the exact sciences and logic and stands out against the background of an objective reality. Here, linguistic expressions describe and index the world of the interlocutors and allow logical reasoning.

Linguistic semantics, or conceptualization

Just as pragmatic semantics has referential semantics as its condition of possibility, referential semantics in turn can only manifest itself on the basis of linguistic semantics. Indeed, a sentence must first have a meaning in a certain language and evoke a certain mental representation – a concrete or metaphorical scene – before this conceptualization can be compared with reality and declared true, false or half true. The linguistic meaning of an expression comes from the words it is composed of and the meaning assigned to them in a dictionary. It also comes from the grammatical roles that these words play in the sentence. In sum, linguistic meaning emerges from the inter-definitional, suitability, similarity, and difference relations between words in the dictionary and from the grammatical relations between words in the sentence. A different circumstantial complement, the replacement of one verb by another, or a singular instead of a plural would change the meaning of the sentence and produce a different narrative. Linguistic meaning is differential. At this level of semantics, the meaning of a sentence is determined in its relation to the language, and even in relation to its translation into other languages. Prior to the question of truth, it has more affinities with literature than with logic.

IEML has been designed to solve the problem of coding linguistic, or conceptual, meaning (carefully distinguished from pragmatic and referential meaning).

A mathematical language

The syntax of IEML is mathematical: it is defined by two nested functions. To create words, a morphological function with three multiplicative roles operates on an alphabet of six primitive variables. To create sentences, a syntagmatic function with nine multiplicative roles (verb, subject, object, etc.) operates on the alphabet of about 3000 words generated by the morphological function. The paradigms and their symmetries are nothing but the matrices generated by these functions when one, two or three of their roles are variable. Each word in the IEML dictionary belongs to one and only one paradigm. A sentence, on the other hand, can belong to several paradigms, and sentence paradigms are freely created by the speakers. On the regular grid that crosses syntagmatic trees and paradigmatic matrices, link creation functions make it possible to weave semantic relations from syntactic conditions and to create as many hypertexts or knowledge graphs as one wants. In short, the semantics of IEML is computable.
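As a rough illustration of how a paradigm arises when roles are left variable, the sketch below treats a single application of a three-role function as a Cartesian product. The placeholder symbols a to f, the function name paradigm and the flat tuple encoding are assumptions made for this example. They are not IEML's primitives or its actual morphological function, and the real dictionary is not enumerated this way.

```python
from itertools import product

# Six placeholder symbols standing in for the six primitive variables.
PRIMITIVES = ("a", "b", "c", "d", "e", "f")

def paradigm(role_1, role_2, role_3):
    """Build every cell of a paradigm from the candidates of three roles.

    Each argument is a tuple of candidate symbols. A constant role has a
    single candidate; a variable role has several.
    """
    return list(product(role_1, role_2, role_3))

# One variable role: a one-dimensional paradigm with 6 cells.
row = paradigm(PRIMITIVES, ("a",), ("b",))

# Two variable roles: a 6 x 6 matrix, flattened here into 36 cells.
table = paradigm(PRIMITIVES, PRIMITIVES, ("b",))

# Three variable roles: a 6 x 6 x 6 array with 216 cells.
cube = paradigm(PRIMITIVES, PRIMITIVES, PRIMITIVES)

print(len(row), len(table), len(cube))
```

With one, two or three variable roles the result has 6, 36 or 216 cells, which is the sense in which a paradigm is a matrix indexed by the roles that vary while the others stay constant.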

A philological language

The structure of the IEML sentence allows for the description of any complex interaction. Verbs can be affirmative, negative, interrogative or other, and they cover the whole range of logical modalities and grammatical moods. The sentence is recursive, since the concepts that compose it can be actualized by words or by sentences, which can in turn contain sentences, and so on. IEML has the narrative power of natural languages. IEML’s dictionary contains all kinds of deictics, and its grammar makes it possible to state explicitly the operations of reference (extralinguistic) and self-reference (references of linguistic expressions to other expressions). Finally, our metalanguage makes it possible, in principle, to translate any concept, story or reasoning expressed in natural language. The IEML editor’s parser already makes it possible to write texts in IEML using only French or English words.