Part 6: What string theory reveals about AI chat models
This post is part of the AI Apprenticeship series:
- Part 1: AI Apprenticeship 2024 @ DiploFoundation
- Part 2: Getting introduced to the invisible apprentice – AI
- Part 2.5: AI reinforcement learning vs human governance
- Part 3: Crafting AI – Building chatbots
- Part 4: Demystifying AI
- Part 5: Is AI really that simple?
- Part 6: What string theory reveals about AI chat models
- Part 7: Interpretability: From human language to DroidSpeak
By Dr Anita Lamprecht (supported by DiploAI and Gemini)
In Diplo’s AI Apprenticeship online course, we are now delving deeper into the theory behind the DiploAI chatbots we built. So far, we have concentrated on gaining a solid understanding of the core concepts and terminology needed to build our chatbots. The aim of our apprenticeship is to enable diplomats to utilise chatbots in their profession and provide them with the necessary AI literacy for negotiations on the topic of AI.
A vital part of this literacy involves an awareness of what we should and can know (‘the knowns’) and what we cannot know (‘the unknowns’). Large language models (LLMs) are a type of neural network, and we should know that such networks have an AI ‘black box problem’. This means there is a phase in the entire process that nobody fully understands. According to our course lecturer, Jovan Kurbalija, this known unknown (i.e. the black box problem) is often utilised or misused in dystopian AI narratives about the dangers of AI and artificial general intelligence (AGI).
Just as physicists grapple with the ‘known unknowns’ and even ‘unknown unknowns’ of the universe, we, as AI apprentices, are learning to navigate the complexities of these powerful language models. Sometimes, the most unexpected connections emerge, like finding a string theory metaphor hidden within the universe of AI.
Yarn, string theory, and AI?
Take, for example, our recent exploration of how LLMs store and represent language. To illustrate this, our lecturer used strings – yes, actual pieces of yarn! This seemed a bit odd to me at first, but it reminded me of an article I had recently read about string theory.
String theory is a complex concept in physics that attempts to explain the universe as being composed of tiny, vibrating strings. I recently listened to a fascinating interview with American physicist and professor Leonard Susskind, one of the pioneers of string theory. Surprisingly, he stated, ‘String theory is definitely not the theory of the real world.’ While he acknowledged that string theory offers a compelling framework, there is still much we do not know about whether it accurately reflects reality. Susskind also noted that we do not know if string theory will help us uncover these mysteries and that we remain uncertain about what ‘it’ truly is.
A web of words
Jovan’s version of string theory is much simpler. He uses strings to illustrate what we can know – a crucial aspect of making informed decisions about AI governance and about applying AI in governance. Picture our classroom transformed into a web of interconnected words. We, the apprentices, became nodes, each holding a string that represented a word. The connections between us symbolised the relationships between those words. For example, ‘diplomacy’ might be closely linked to ‘treaty,’ ‘negotiation,’ and ‘ambassador,’ while ‘sanction’ might connect to ‘embargo,’ ‘resolution,’ and ‘compliance.’
Suddenly, language is no longer a simple linear sequence of words. It becomes a dynamic, multidimensional network, similar to the complex web of interactions described in string theory. The ‘strings’ represent the probabilistic positioning of words – the likelihood of one word appearing in proximity to another. This is how LLMs learn to generate human-like text: by navigating this complex web of relationships and probabilities.
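To make ‘probabilistic positioning’ a little more tangible, here is a minimal Python sketch that simply counts which word follows which in a tiny, made-up corpus of diplomatic vocabulary. The corpus and all names in it are illustrative assumptions, and real LLMs learn these relationships with neural networks trained on billions of examples rather than a counting table – but the underlying intuition is the same: some words are simply more likely to appear next to others.

```python
# A minimal sketch (not DiploAI's actual method): estimate how likely one word
# is to follow another by counting neighbours in a tiny, made-up corpus.
from collections import Counter, defaultdict

corpus = (
    "the ambassador signed the treaty after the negotiation "
    "the ambassador opened the negotiation before the treaty"
).split()

# Count how often each word is followed by each other word (bigram counts).
follow_counts = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    follow_counts[current_word][next_word] += 1

def next_word_probabilities(word):
    # Turn raw counts into probabilities: the 'strength' of each string.
    counts = follow_counts[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probabilities("the"))
# {'ambassador': 0.33..., 'treaty': 0.33..., 'negotiation': 0.33...}
```

Running this prints roughly equal probabilities for ‘ambassador’, ‘treaty’, and ‘negotiation’ after ‘the’ – a crude, countable version of the strings we held in class.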
Layers, dimensions, and a bit of confusion
Our lecturer introduced a new concept that can feel a bit complex at first. He explained that words in AI are connected within something called a ‘high-dimensional vector space’. Think of this as a multi-dimensional map where each word is represented as a point, and their positions show how closely related they are.
These points, called ‘vectors’, are built using mathematical formulas that capture the relationships between words. You can imagine these as layers of meaning. Each layer highlights a different aspect of how a word relates to other words – like its meaning, context, or how often it appears near another word.
In simpler terms:
- Vectors in AI: These are like coordinates that help AI organise and process data.
- What makes up a vector: It’s just a list of numbers (called ‘floating-point numbers’) that shows where the word is located in this multi-dimensional space.
To make this easier to picture, imagine holding physical strings that connect words. These strings represent the relationships between words, like how ‘apple’ connects to ‘fruit’, or ‘king’ connects to ‘queen’. Visualising it this way helps you see how words are related in a space that has many dimensions.
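If you prefer numbers to yarn, here is a minimal sketch of the same idea in Python. Each word is just a short list of floating-point numbers, and a standard measure called cosine similarity tells us how close two words sit in the space. The three-dimensional vectors below are made-up toy values – real models use hundreds or thousands of dimensions – so treat this purely as an illustration.

```python
# A minimal sketch of words as vectors: each word is a list of floating-point
# numbers, and 'closeness' is measured mathematically. Toy values only.
import math

word_vectors = {
    "king":  [0.8, 0.6, 0.1],
    "queen": [0.7, 0.7, 0.1],
    "apple": [0.1, 0.2, 0.9],
    "fruit": [0.2, 0.1, 0.8],
}

def cosine_similarity(a, b):
    # Higher values mean the two words sit closer together in the space.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(word_vectors["king"], word_vectors["queen"]))  # close to 1
print(cosine_similarity(word_vectors["king"], word_vectors["apple"]))  # much lower
```

In a real LLM, these coordinates are not hand-written but learned during training, which is exactly where the layers of meaning mentioned above come from.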
Jovan’s ‘string theory’ is a fantastic way to bring these abstract ideas to life. By physically showing how words connect, he makes it easier to grasp concepts that would otherwise remain stuck in mathematical definitions.
The AI Apprenticeship online course is part of the Diplo AI Campus programme.