Towards a paradigm shift in Artificial Intelligence
IEML (the Information Economy MetaLanguage) is a language invented by Pierre Lévy that is simultaneously expressive like a natural language and regular like a mathematical language. Its phrases may embed data and can be generated and interpreted by algorithms.
Our first application, the IEML editor, will be used to build data models (or taxonomies, semantic networks, ontologies, knowledge graphs…). There are two main steps to writing an IEML data model.
- The generation of nodes (semantic addresses of data)
- The programming of links (semantic relations) between nodes
Both nodes and links are IEML sentences translated into natural languages, and both can be created automatically.
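The two-step workflow above can be pictured in code. The sketch below is purely illustrative: the `Node` and `Link` classes, the string addresses, and the relation labels are hypothetical stand-ins, since in IEML both addresses and relations are actual IEML sentences.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the two-step workflow. In IEML, a node's
# address is an IEML sentence; plain strings stand in for them here.

@dataclass
class Node:
    address: str                        # semantic address of the data
    translations: dict = field(default_factory=dict)  # natural-language labels

@dataclass
class Link:
    relation: str                       # semantic relation, itself a sentence in IEML
    source: Node
    target: Node

# Step 1: generate nodes
tree  = Node("plant.woody.trunk", {"en": "tree", "fr": "arbre"})
maple = Node("plant.woody.trunk.maple", {"en": "maple", "fr": "érable"})

# Step 2: program links between nodes
links = [Link("is a kind of", maple, tree)]

print(links[0].source.translations["en"], links[0].relation,
      links[0].target.translations["en"])
```

Both steps operate on the same sentence-like objects, which is what makes it possible to generate nodes and links automatically rather than entering them one at a time.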
The following text explains the role that IEML may play in the future development of artificial intelligence.
Current obstacles in the field of artificial intelligence
Although artificial intelligence (AI) has made considerable progress since the middle of the 20th century, many obstacles still stand in the way of an effective, democratized AI working for human development and collective intelligence. AI is currently divided into two branches. Symbolic AI specializes in conceptual modeling and automatic reasoning, while neural AI (machine learning, deep learning) excels in learning from examples and automatic categorization.
In symbolic AI, logical modeling is costly in human effort. Neural AI, on the other hand, requires less human effort but is limited by a variety of problems: an inability to distinguish causes from effects due to an exclusively statistical approach; inexplicable results; an inability to generalize beyond the training data; and a blindness to meaning despite advances in translation and automatic text generation. Moreover, we can already observe diminishing cognitive returns as models, datasets, and computing power grow in size.
Finally, all of AI today is compartmentalized. Difficulties in accumulating and exchanging knowledge plague both neural and symbolic approaches. By contrast, human intelligence integrates, adds, and swaps knowledge naturally by using language. Humans have an innate understanding of linguistic semantics. But contemporary AI cannot use natural languages the way humans do because of their ambiguity, and no equivalent of a human language (with all its sophistication) yet exists for machines to use. How is artificial intelligence supposed to model human cognition without a mathematical model of human language, including its semantic aspects?
The reality is that the main obstacle to further substantial developments in AI is the lack of a common computable language. This is precisely the problem that has been solved by IEML, which has both the ability to express meaning, like natural languages, and the unambiguous and computable nature of a mathematical language. The use of IEML will make AI less costly in human effort, better in handling causality and meaning, and above all, able to accumulate and exchange knowledge.
Semantics governs the organization of memory, the coherence of reasoning, and the content of any communication. Its importance in human intelligence can therefore hardly be overstated.
The main function of our metalanguage IEML is none other than to automate an essential dimension of human intelligence: semantic processing, i.e., the processing of the meaning of words and sentences. If artificial intelligence is basically a mechanical simulation of human intelligence, then automating semantic processing would be a major step forward in its development. But what is semantics, anyway?
The role semantics plays in linguistics and cognition
From the point of view of the scientific study of language, the meaning (or semantics) of a word or a sentence can be broken down into two parts that are mixed in reality, but conceptually distinct: linguistic semantics and referential semantics. Roughly speaking, linguistic semantics deals with the relationship between words, while referential semantics is all about the relationship between words and the things they represent.
Linguistic semantics, or word-word semantics. The meaning of a symbol (word or sentence) is based on the language to which it belongs, its grammar, and its dictionary. In a classical dictionary, each word is related to words of similar meaning (as in a thesaurus) and defined using words that are themselves explained by other definitions, and so on, circularly. For example, the word « tree » means « a woody plant of variable size, whose trunk grows branches from a certain height ». Common nouns and verbs (for instance: tree, animal, limb, eat) represent categories that are themselves connected by a dense network of semantic relations such as « is a part of », « is a kind of », « belongs to the same context as », « is a cause of », « is prior to », etc. We can only think and communicate as humans because our personal and collective memories are organized by general categories connected by semantic relations.
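Such a network of categories can be sketched as a set of labeled triples. This is an illustration only, assuming made-up category names; the relation labels are the ones named just above, not IEML syntax.

```python
# Illustrative only: a tiny semantic network of general categories,
# using the relation types named in the text. The category names and
# the `related` helper are hypothetical.

relations = [
    ("maple",  "is a kind of", "tree"),
    ("tree",   "is a kind of", "plant"),
    ("branch", "is a part of", "tree"),
    ("forest", "belongs to the same context as", "tree"),
]

def related(category, relation):
    """All categories linked to `category` by `relation`."""
    return [tgt for src, rel, tgt in relations
            if src == category and rel == relation]

print(related("maple", "is a kind of"))   # ['tree']
```

Even this toy version shows why the network matters: once the relations exist, questions like "what kind of thing is a maple?" become simple traversals.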
Referential semantics, or word-thing semantics. In contrast to linguistic semantics, which links symbols together, referential semantics bridges the gap between a linguistic symbol and a referent, i.e., a real thing or event. If I say: « That tree in the yard is a maple tree », then I am pointing to a reality. This statement involves linguistic semantics, because we need to know the meaning of each word and enough English grammar to understand it. But beyond this linguistic dimension, referential semantics is also at play, since the statement refers to a particular object in a concrete situation. Some words, such as proper nouns, have no linguistic semantics at all, only referential semantics. For example, the signifier « Alexander the Great » refers to a historical figure and the signifier « Tokyo » refers to a city.
The role semantics plays in computer science and AI today
In computer science, references or individuals (the realities we are talking about) are the data, while general categories are the fields, or metadata, that are used to classify and retrieve data. For example, in a company’s database, « employee name », « address » and « salary » are categories or metadata while « Tremblay », « 33 Bd René Lévesque » and « 65 K$ per year » are data. In a nutshell, referential semantics corresponds to the relationship between data and metadata and linguistic semantics to the relationship between metadata (or organizing categories).
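The company-database example above can be written out directly. This is a minimal sketch of the text's own example, not any real schema; the point is only to make the data/metadata split concrete.

```python
# The text's example: metadata are the organizing categories,
# data are the individual values they classify.

metadata = ["employee name", "address", "salary"]   # categories (metadata)

record = {                                          # data: one individual referent
    "employee name": "Tremblay",
    "address": "33 Bd René Lévesque",
    "salary": "65 K$ per year",
}

# Referential semantics (word-thing): the relation between data and metadata.
assert set(record) == set(metadata)

# Linguistic semantics (word-word): relations among the metadata themselves,
# which have to be modeled separately, e.g.:
category_relations = [
    ("salary", "belongs to the same context as", "employee name"),
]
print(len(record), "data values under", len(metadata), "categories")
```

The first relation (record value to field name) is what a database already captures; the second (field name to field name) is exactly what most databases leave implicit, which is the gap discussed next.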
To the extent that the main purpose of computer science is to augment human intelligence, one of its tasks is to make sense of the flood of digital data by extracting the maximum amount of actionable knowledge from it. To do so, data first has to be categorized (referential semantics: word-thing), and then the categories have to be organized according to relevant relations (linguistic semantics: word-word).
Why is present-day AI unable to grasp linguistic semantics?
In human intelligence, the meaning of words and their interrelationships is a given in any living spoken language. Although natural languages such as French, English, or Mandarin are irregular and ambiguous, humans are skilled at interpreting their expressions according to context. But computers are not embodied and sentient like us. It is therefore not enough to give a computer a grammar book and a dictionary for it to understand a natural language. Because a word can have several meanings, because a meaning can be expressed by several words, because sentences have several possible interpretations, and because every grammar is elastic, computers are unable to correctly interpret statements in natural languages. In fact, computers do not see words or sentences as concepts standing in determined relations with other concepts within the framework of a language: as of today, they only read (and interpret) sequences of letters, or « strings of characters ».
Therefore, all the relationships between categories that seem so obvious to humans because they fall under linguistic semantics must be added and connected by hand in a database if a program is to take them into account.
This is also why machines are still unable to understand the meaning of the texts they translate or write. Admittedly, performance in translation (see Google Translate and DeepL) and in automatic writing (as illustrated by GPT-3) is improving. But the algorithms for automatic translation from language A to language B are strictly based on statistical correspondences between texts in language A and texts in language B. The neural networks doing the translation resemble the brain of a mechanical parrot, capable only of imitating linguistic performances without having any idea of their content.
Also, a computer doesn’t “learn” anything from what it translates because it doesn’t “understand” the meaning of what it is translating. Consequently, a computer is unable to populate a database by itself with the knowledge contained in the material it has just translated.
For its human speakers, a natural language extends a net of general categories that explain each other. This common semantic network makes it possible to describe and communicate both the multiple concrete situations of life and the different domains of knowledge. But, because of the limitations of machines, AI cannot make any natural language play this role. This is why AI remains fragmented today into siloed micro-domains of practices and knowledge, each with its own particular semantics.
Our invention: IEML, the Information Economy MetaLanguage
Introducing semantics into AI
IEML has been precisely designed to solve all the semantic problems currently encountered in AI. IEML can be used as a common framework for data modeling and knowledge representation by allowing communication – including full linguistic semantics – between AI systems, between humans and AI systems, and between all natural languages.
Moreover, IEML is programmable, which reduces the time needed to design data models by making them highly modular and reusable. Let’s review the unique features of IEML that make this possible:
IEML has the same expressive power as any natural language. It can manifest, evoke, and translate any category, concept, sentence, or text. Note that IEML is neither a universal ontology nor a particular classification, but a language able to express all ontologies and classifications, making them interoperable.
IEML is also a regular language. It has a grammar with mathematical properties and a dictionary of three thousand words, built from elementary semantic blocks and organized in thematic tables called paradigms.
IEML semantics is univocal. Every sentence has only one possible meaning, and this meaning is computable because it is automatically deduced from the IEML dictionary and grammar, which is not the case in natural languages, where meaning is ambiguous.
The IEML editor: our first software
IEML, the metalanguage of the information economy, is a language defined by its grammar and its dictionary. The IEML editor (under construction) is a software tool used by model designers to produce data models that end users can then explore.
Data models built with IEML encompass semantic networks, semantic metadata systems, ontologies, and knowledge graphs. The IEML editor can be used to design labeling systems for training datasets, making machine learning more coherent and more efficient. The editor includes a declarative programming language that automates the creation of categories (nodes) and semantic relations (links).
Contemporary ontology editors organize logical relations between categories one by one. Just as we create semantic relations between words when we speak, the IEML editor innovates by using sentences to express the semantic relations between categories. Since IEML models can turn data into narratives, they make extracting knowledge from the data easier and more effective.
IEML editor features
Three fundamental features distinguish the IEML editor from current tools used to model data or create ontologies: categories and semantic relations in IEML are programmable, and the resulting models are interoperable and transparent.
Categories and their semantic relations are programmable. The regular structure of IEML allows categories to be generated, and the semantic relations between them to be woven, functionally and automatically instead of one by one (by hand). This property saves the designer considerable time.
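The contrast between hand-wiring relations and generating them functionally can be shown in miniature. The grid below and all its names are hypothetical (this is not IEML editor code); it only illustrates deriving many categories and relations from a few rules, in the spirit of a paradigm table.

```python
from itertools import product

# Hypothetical sketch: derive categories and relations from rules
# instead of enumerating them by hand. All names are illustrative.

processes = ["plant", "harvest", "store"]
crops     = ["wheat", "rice", "maize"]

# Generate 9 categories from a 3x3 paradigm-like grid.
categories = [f"{p} {c}" for p, c in product(processes, crops)]

# Weave relations functionally: for each crop, each step precedes the next.
precedes = [(f"{a} {c}", "is prior to", f"{b} {c}")
            for c in crops
            for a, b in zip(processes, processes[1:])]

print(len(categories), "categories,", len(precedes), "relations")   # 9 categories, 6 relations
```

Two short lists and two comprehensions yield nine categories and six relations; scaling the grid scales the model with no extra hand work, which is the time saving the paragraph describes.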
Models are interoperable. The particularity of IEML is that every model is built from the same 3,000-word dictionary, in which each word has a unique and unequivocal definition, and from the same regular grammar. This makes all models interoperable: categories and sub-models can easily be shared. Each model remains context-specific but can be compared, interconnected, or merged with others.
Models are transparent. Unlike the words we use in everyday language, which may have many different meanings and thus introduce ambiguity, all categories and semantic relations in IEML are uniquely and explicitly defined. IEML models are simple, easy to navigate, and self-explanatory to both designers and end users. This makes all IEML models consistent with contemporary principles of ethics and transparency.
End users will be able to explore the modeled content through data mining, hypertextual exploration and visualization using graphs and paradigmatic tables.
All models coded in IEML are also accessible and readable in natural languages. By using the IEML editor, non-computer scientists will be able to design data models easily after a few learning sessions. In the future, IEML could be taught in schools and open the way to a democratization of data modeling.
The future: neuro-semantic AI systems
We believe that the future of AI is neuro-semantic, i.e., built from the right combination of neural machine learning and IEML-based semantic modeling and computing. The IEML editor opens the way for a new generation of AI systems combining the strengths of both branches of AI, neural and symbolic, while enabling knowledge integration and interoperability. Obvious applications for such neuro-semantic AI systems include data integration, decision support based on causal models, knowledge management, text comprehension and summarization, controlled text generation, chatbots, and robotics.
Semantic Interoperability at the Service of Collective Intelligence
At INTLEKT, we create semantic metadata systems fitting your organization’s needs with a focus on complex human systems like software, games, health, the environment and urban phenomena.
Thanks to our unique patented technology, IEML (the Information Economy MetaLanguage), complex models become explorable and interoperable, and can be translated automatically into several natural languages.
Our lead consultant, Pierre Lévy, Ph.D., Fellow of the Royal Society of Canada, has deep experience in knowledge engineering. He has published thirteen books, translated into twelve languages, including Collective Intelligence, Becoming Virtual, and The Semantic Sphere, which explore the epistemological and anthropological aspects of digital technologies.