Go there for a general introduction to IEML
IEML (Information Economy MetaLanguage) is a language with computable semantics that can be considered from three complementary points of view: linguistics, mathematics and computer science. From the linguistic point of view, it is a philological language, i.e. it can translate any natural language and is able to define its own words. Mathematically, it is a topos, i.e. an algebraic structure (a category) in isomorphic relation with a topological space (a network of semantic relations). Finally, on the computing side, it functions as the indexing system of a virtual database and as a programming language for semantic networks, ontologies and dataset labels. IEML solves a series of problems in the fields of artificial intelligence (labels for Machine Learning and ontologies for Knowledge Graphs) and knowledge management.
The Problem of Coding Linguistic Meaning
Computer science currently has no uniform and computable way to encode linguistic meaning, as it can, for example, encode images by means of pixels or vectors. To represent meaning, we still rely on natural languages, which are notoriously multiple, changing, and ambiguous. With the notable exception of number notation and mathematical codes, our writing systems are primarily designed to represent sounds. Their representation of categories or concepts is indirect (characters → sound → concepts) and difficult for computers to grasp. Computers can handle syntax (the regular arrangement of characters), but their handling of semantics remains imperfect and laborious. Despite the success of machine translation (DeepL, Google Translate) and automatic text generation (GPT-3), computers don’t really understand the meaning of the texts they read or write.
Many advances in computer science come from the discovery of a coding system that makes the coded object (number, image, sound, etc.) easily computable. Now we want to make concepts or categories – linguistic meaning – systematically computable. In order to understand how IEML can solve this problem, we must recall a number of semiological principles.
Linguistic versus referential meaning
Meaning circulates between three poles: the sign, the interpreter and the referent.
The sign is what represents or indicates.
The referent is the object of the sign, what is represented or indicated.
The interpreter – a subjectivity, a mind, a self-referential process – is the person for whom the sign represents the referent. A sign must be recognized as such and interpreted in a certain context in order to point to its referent.
Symbols, especially linguistic symbols, are particular types of signs that are divided into two parts related to each other: a signifier (a sound, image, gesture, etc.) and a signified (a general category). While the relationship between sign and referent depends largely on context and interpretation, the correspondence between signifier and signified (internal to the sign) is fixed by a linguistic convention. It does not depend on our personal interpretation that the sound « horse » refers to a domestic animal capable of galloping: it depends on the English language. It is thanks to this relatively fixed convention that the linguistic code enables communication and the accumulation of a collective memory.
Although they work together in practical situations, it is important to distinguish logically between referential meaning and linguistic meaning. Referential meaning points to the relationship between a symbol and its referent (an « external » object in a particular context). In contrast, linguistic meaning is a relation – « internal » to the symbol – between its material part (a sound, image, gesture), and its conceptual part (a category). IEML solves the problem of coding the linguistic meaning in a computable way.
IEML is simultaneously a regular language, a philological language and a virtual database.
In order to solve the problem of coding linguistic meaning, we use a regular language as the system of signifiers for IEML. Furthermore, we establish – by convention and by construction – that in IEML the relations between signifieds are computable functions of the relations between the corresponding signifiers. Since the relations between signifiers in IEML are computable (it is a regular language), the relations between its signifieds are also computable. It is thus thanks to its algebraic system of signifiers and by virtue of an ideographic parallelism between signifier and signified that IEML has a computable linguistic semantics.
IEML is created from 6 primitive symbols which are combined using regular functions to generate about 3000 elements. A grammar – also completely regular – is then used to build words and sentences from these elements. The IEML elements, words, sentences and the semantic relationships between these grammatical units are generated by programmable functions.
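The idea that relations between signifieds can be computed from relations between signifiers can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the six placeholder symbols are not the actual IEML primitives, the fixed-length pattern is not IEML's grammar, and the "same paradigm" convention is invented here only to show a semantic relation defined as a function of the signifiers.

```python
import re
from itertools import product

# Hypothetical primitive symbols: placeholders, not the actual IEML primitives.
PRIMITIVES = "abcdef"

# A toy regular language: every signifier is exactly three primitives long.
SIGNIFIER = re.compile(f"[{PRIMITIVES}]{{3}}")


def generate_signifiers():
    """Enumerate every string of the toy regular language."""
    return ["".join(p) for p in product(PRIMITIVES, repeat=3)]


def is_well_formed(s):
    """Membership test for the toy regular language."""
    return SIGNIFIER.fullmatch(s) is not None


def differing_positions(s1, s2):
    """A computable relation between two signifiers: where they differ."""
    return [i for i, (x, y) in enumerate(zip(s1, s2)) if x != y]


def same_paradigm(s1, s2):
    """By convention (an assumption of this sketch), two signifieds count as
    members of the same paradigm when their signifiers differ in exactly
    one position."""
    return len(differing_positions(s1, s2)) == 1


signifiers = generate_signifiers()
print(len(signifiers))              # 6 ** 3 = 216
print(same_paradigm("abc", "abd"))  # True
print(same_paradigm("abc", "bad"))  # False
```

Because `same_paradigm` only inspects the strings, the relation between the two signifieds is obtained without any lookup in an external dictionary, which is the point of the parallelism described above.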
Just as human memory is organized by natural languages, computer memory could be organized by a linguistic code adapted to its mechanical nature, such as IEML. In keeping with the semiotic theory on which it is based, IEML clearly distinguishes the notation of categories (in the form of semantic metadata) from the notation of references (in the form of data). Furthermore, IEML has a notation that assigns data to categories. IEML thus functions as a virtual database whose indexing system is a language with computable semantics. This computability of the language includes the ability to program sets of categories and semantic relations.
Overview of semantic coding: USLs and UKGs
What is a USL?
A USL or Uniform Semantic Locator is a semantic label or a system of labels. Depending on the point of view, a USL – the same object – designates a category or a graph of semantic relations between categories. Indeed, in IEML, a category is represented by a semantic network and vice versa. If a USL generates only one expression, we speak of a singular category. If it generates several expressions, it is a paradigm.
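The distinction between a singular category and a paradigm, and the dual reading of a USL as a category or as a graph, can be sketched in Python as follows. This is a toy model under stated assumptions: a "USL" is represented here as a tuple of slots with alternatives, English words stand in for IEML expressions, and the rule that links two expressions when they differ in one slot is a hypothetical semantic relation, not the one defined by IEML.

```python
from itertools import product


def expand(usl):
    """Expand a toy USL into the list of expressions it generates.

    A toy USL is a tuple of slots; each slot is a tuple of alternatives.
    """
    return [" ".join(choice) for choice in product(*usl)]


def kind(usl):
    """Classify a toy USL as a singular category or a paradigm."""
    return "singular category" if len(expand(usl)) == 1 else "paradigm"


def relations(usl):
    """Read the same USL as a graph: link expressions differing in one slot."""
    exprs = list(product(*usl))
    return [
        (" ".join(a), " ".join(b))
        for i, a in enumerate(exprs)
        for b in exprs[i + 1:]
        if sum(x != y for x, y in zip(a, b)) == 1
    ]


# Placeholder vocabulary, chosen only for readability.
singular = (("horse",), ("gallops",))
paradigm = (("horse", "dog"), ("gallops", "sleeps"))

print(kind(singular), expand(singular))
print(kind(paradigm), expand(paradigm))
print(relations(paradigm))
```

The same object, `paradigm`, is read once as a set of generated expressions and once as a list of edges between them, mirroring the two points of view described above.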
The USL is about linguistic meaning.
What is a UKG?
A UKG or Uniform Knowledge Graph is a virtual graph database whose semantic metadata system is a USL. Each subcategory of the organizing USL can contain data, references or literals. These data are structured according to the three parts listed below.
- A reference identifier (refid) that is used to trace co-references such as grammatical anaphora and cataphora.
- A data type (dtype) that indicates whether the value is a proper name, a number, a date, a GPS coordinate, a URL, a USL, a natural language (identified by its ISO 639 code), etc.
- A value.
The UKG is about referential meaning.
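The three-part structure of a datum (refid, dtype, value) and its assignment to a category can be sketched in Python as below. The class names `Reference` and `ToyUKG`, the dtype strings, the plain-string categories standing in for USL subcategories, and the sample entries are all hypothetical; the sketch shows one possible shape for such records, not how IEML stores them.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Reference:
    refid: str  # identifier used to trace co-references
    dtype: str  # e.g. "proper_name", "number", "date", "url" (assumed labels)
    value: str  # the literal itself, kept as text in this sketch


class ToyUKG:
    """Maps each category (a plain string standing in for a USL
    subcategory) to the references filed under it."""

    def __init__(self) -> None:
        self._by_category: Dict[str, List[Reference]] = defaultdict(list)

    def add(self, category: str, ref: Reference) -> None:
        """File a reference under a category."""
        self._by_category[category].append(ref)

    def lookup(self, category: str) -> List[Reference]:
        """Return a copy of the references filed under a category."""
        return list(self._by_category.get(category, []))

    def coreferences(self, refid: str) -> List[str]:
        """Return the categories whose data share the given refid."""
        return sorted(
            cat
            for cat, refs in self._by_category.items()
            if any(r.refid == refid for r in refs)
        )


# Illustrative entries only.
ukg = ToyUKG()
ukg.add("horse", Reference("r1", "proper_name", "Bucephalus"))
ukg.add("domestic animal", Reference("r1", "proper_name", "Bucephalus"))
ukg.add("horse", Reference("r2", "url", "https://example.org/horse"))

print(ukg.lookup("horse"))
print(ukg.coreferences("r1"))
```

Keeping categories (the keys) apart from references (the values) follows the separation between linguistic and referential meaning drawn above, and the shared `refid` is what lets two categories point to the same referent.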