Like many others of my time, my first introduction to artificial intelligence was in the late 1990s with ELIZA, a simple chatbot created in the 1960s by Joseph Weizenbaum. But ELIZA wasn't AI. It was just a clever program that used rudimentary natural language processing to imitate a psychotherapist. Still, it was an experience that would shape my later perception of LLMs and AI. I never expected that learning about artificial intelligence would lead me down the paths of Data Analytics and Machine Learning (ML). As it turns out, there's more to artificial intelligence than Large Language Models (LLMs). This is the story of how I am taking a deep dive into learning about artificial intelligence.
My high school computer lab had mainframe terminals where a few students would play multi-user dungeon games and occasionally chat with ELIZA during our lunch breaks. I quickly grew bored with ELIZA once I realized I was essentially having a conversation with myself. When ChatGPT became available, I expected roughly the same experience, so I didn't jump on it when many others did.
What first got me interested in LLMs was not ChatGPT. It was a chatbot for my phone, Replika. One day I stumbled upon an article describing how the founder of Replika, Eugenia Kuyda, had collected text messages from her deceased best friend. With that data, she was able to create a chatbot that emulated his text messages as a way to help her cope with her loss. Others asked her to do the same for loved ones they had lost, and she went on to found Replika to provide companionship to others. The technology was originally based on an autoregressive GPT-3 model, though Replika has since moved to a more custom model.
When I first signed up to try Replika, the company was dealing with users being abusive toward their AI companions, which are called Reps. Those abusive users were teaching their Reps some very dysfunctional behaviors. This piqued my interest in how LLMs are trained, because many of the fixes the company implemented ended up changing the "personalities" of the Reps.
Eventually, I tried out ChatGPT to find out what all the fuss was about. Soon after I signed up, OpenAI released GPT-4 to paid users, and I started using ChatGPT for writing emails and thinking through projects, even subscribing for access to GPT-4. I also explored other LLMs such as Claude AI, Grok, and Gemini. One particularly interesting product I use is Mem.ai, which lets you build a library of notes that the underlying GPT model uses to connect your notes in ways that are more easily discoverable and serendipitous. Mem is particularly useful in helping me write because it can pull in previous notes as source material.
Early on, I learned that LLMs are not particularly intelligent. They are not aware. They are more like sophisticated pocket calculators: you input an expression and you get a response. However, you do not always get the same response. You may get similar responses, but not identical ones. In that way, they are not like calculators at all.
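The reason the responses vary is that an LLM samples from a probability distribution over possible next words rather than computing one fixed answer. Here is a minimal, made-up sketch of that idea in Python; the word probabilities are invented for illustration and do not come from any real model:

```python
import random

# Toy next-word probabilities a model might assign after the prompt
# "The weather today is" -- made-up numbers for illustration only.
next_word_probs = {"sunny": 0.45, "cloudy": 0.30, "rainy": 0.20, "purple": 0.05}

def sample_next_word(probs):
    """Pick a word at random, weighted by its probability."""
    words = list(probs.keys())
    weights = list(probs.values())
    return random.choices(words, weights=weights, k=1)[0]

# Run the same "prompt" several times: the answers are similar but not identical.
for _ in range(5):
    print("The weather today is", sample_next_word(next_word_probs))
```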
My approach to learning about AI and LLMs started top-down: I was trying to learn all I could about how the models work. That approach gives you an idea of how the big pieces fit together, but there is no foundation to support the knowledge. I needed to learn how this is built from the ground up.
One day I met up with Shashi Bellamkonda, who was visiting for a conference. Shashi uses AI for market analysis, and he suggested that I look into the AI courses offered by Google. I had been missing a big piece of the puzzle: how you can take your own data and use AI to come up with insights. As I have worked through the Google courses, that knowledge is starting to come together. It has been an invaluable suggestion.
The Foundation: Data Analytics
Data Analytics is, in my terms, about taking data and extracting useful information that can help drive decisions. Spreadsheets are how most of us first encounter data analytics, at least for small amounts of data; larger datasets require more robust database services. With Data Analytics, you can create summaries, graphs, and statistical analyses of the data. However, Data Analytics can only slice and dice historical data. It is up to the user to turn that information into predictions or guesses about what lies ahead, and the limitation is that we cannot grasp everything the data tells us, much as we can't naturally grasp exponential growth.
And you can't just take any data. The phrase "garbage in, garbage out" rings especially true in Data Analytics. Your data must be cleaned, structured, and then analyzed to be useful. If you start out with bad data, you will get bad results.
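To make that concrete, here is a minimal sketch of the kind of cleaning and summarizing I'm describing, using Python and pandas. The data, column names, and values are hypothetical, just to illustrate the workflow:

```python
import pandas as pd

# Hypothetical sales data -- in practice this might come from
# pd.read_csv("sales.csv") or a database query.
raw = pd.DataFrame({
    "region": ["East", "East", "West", "West", None],
    "month": ["Jan", "Feb", "Jan", "Feb", "Feb"],
    "revenue": [1200, 1350, 980, None, 1100],
})

# Clean: drop rows with missing values so bad data doesn't skew the results.
clean = raw.dropna()

# Summarize: total and average revenue per region -- the "slice and dice"
# view of what has already happened.
summary = clean.groupby("region")["revenue"].agg(["sum", "mean"])
print(summary)
```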
The Brainpower: Machine Learning
The way I think about Machine Learning is as a way of training a model, much like an LLM, on data so that the model develops a highly specialized knowledge of how that data behaves. Whereas databases can sort and report, an ML model can develop such an understanding of the data that it can make predictions or reveal patterns that aren't obvious.
As it turns out, the same data from a database can be fed into a Machine Learning model to reveal information that is not obvious from typical database reports. Machine Learning models, unlike databases, are useful for extrapolating from given conditions. What is even more interesting is that ML does not necessarily require data to be as rigidly structured as a database does. You still need clean data, though, to avoid polluting the model with errant behavior.
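Here is a minimal sketch of that idea using Python and scikit-learn (my choice of library for the example): the same kind of tabular data a database holds is used to fit a model that can then extrapolate to a condition it has never seen. The numbers are invented for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical historical data: monthly ad spend (dollars) and the revenue
# that followed. A database report could only summarize these past rows.
ad_spend = np.array([[500], [750], [1000], [1250], [1500]])
revenue = np.array([1200, 1500, 1900, 2300, 2600])

# Train a simple regression model on the historical data.
model = LinearRegression()
model.fit(ad_spend, revenue)

# Extrapolate: ask the model about a condition we have never observed.
predicted = model.predict([[2000]])
print(f"Predicted revenue for $2000 of ad spend: {predicted[0]:.0f}")
```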
Something important to realize about LLMs and ML models is that their training data is baked in. They will only operate based on how they were trained unless other components of the system intervene, typically by modifying the user's prompt to include updated information.
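A minimal sketch of that intervention, in Python: the model's weights stay frozen, so any fact it did not see during training has to be spliced into the prompt at request time. The fetch_updated_facts function and the facts themselves are hypothetical placeholders:

```python
def fetch_updated_facts(question: str) -> list[str]:
    """Hypothetical lookup -- in a real system this might query a search
    index or database for information newer than the model's training data."""
    return ["The 2024 office relocation moved headquarters to Austin."]

def build_prompt(question: str) -> str:
    """Splice fresh facts into the prompt, since the model itself can't know them."""
    facts = "\n".join(fetch_updated_facts(question))
    return f"Use the following up-to-date information:\n{facts}\n\nQuestion: {question}"

print(build_prompt("Where is the company headquartered?"))
```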
AI In Action: Connecting the Dots
How all of this comes together is that LLMs and ML models do not work alone. There are programs that take your prompt and, using Natural Language Processing (NLP), convert it into a format the model understands. The model's reply then has to be decoded back into a form the user understands.
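Here is a toy sketch of that encode/decode step in Python. Real systems use learned tokenizers with vocabularies of tens of thousands of subword tokens; this simplified word-level version is only meant to show the round trip:

```python
# Toy vocabulary mapping words to token IDs -- real tokenizers learn
# subword vocabularies, but the round trip looks the same.
vocab = {"how": 0, "do": 1, "i": 2, "train": 3, "a": 4, "model": 5, "?": 6}
inverse_vocab = {token_id: word for word, token_id in vocab.items()}

def encode(text: str) -> list[int]:
    """Turn the user's words into the numbers the model actually consumes."""
    return [vocab[word] for word in text.lower().split()]

def decode(token_ids: list[int]) -> str:
    """Turn the model's numeric output back into readable text."""
    return " ".join(inverse_vocab[t] for t in token_ids)

prompt = "how do i train a model ?"
token_ids = encode(prompt)
print(token_ids)            # what the model sees: [0, 1, 2, 3, 4, 5, 6]
print(decode(token_ids))    # decoded back for the user
```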
And if you want the AI to remember certain things about you, other parts of the system keep track of those details and feed them back to the model so they can shape its responses, without changing the model's baked-in training.
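A minimal sketch of that memory layer, again in Python; the stored details and the way they are prepended to the prompt are hypothetical, just to show where remembered facts enter the picture:

```python
class UserMemory:
    """Keeps details about the user between conversations."""

    def __init__(self):
        self.facts: list[str] = []

    def remember(self, fact: str) -> None:
        self.facts.append(fact)

    def to_preamble(self) -> str:
        """Render remembered details as context the model sees on every request."""
        return "Things to remember about this user:\n" + "\n".join(f"- {f}" for f in self.facts)

memory = UserMemory()
memory.remember("Prefers short, plain-language explanations.")
memory.remember("Is studying data analytics through Google's courses.")

prompt = memory.to_preamble() + "\n\nUser: Can you explain what a training set is?"
print(prompt)
```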
Challenges and Lessons Learned
In all this I have learned, in broad strokes, how a person can get from a set of data to a trained model that can answer prompts. The challenge, as far as I can tell, is in setting up all the information that will be needed to train the model. If anything is left out, you can't just edit the model. You have to create a new one. Therefore, the cost of AI is not just the compute time needed to train the model. The cost also includes the thorough design and preprocessing of the data used for training. It is not an inexpensive process, to say the least.
From my experience working in IT, I realized early on that I was not as intelligent as I was led to believe. The really intelligent people are the ones creating the operating systems, firmware, and hardware. At most, I am a super-user of their products. I can only use the features they have designed into them. It's the same with AI: all the heavy lifting is being done by the developers creating the AI tools I intend to use. It is a steep learning curve figuring out how AI works and how I can use it for my own needs. For now, I can only drive on the roads that have been paved by the people creating these tools.
With all that said, it is still a bit challenging to learn all of this. I think AI is mind-bending enough that it will stop many people from getting into the weeds with it. At the most basic level, I just wanted to know how to build better prompts to get the results I wanted. But after delving into how AI works, I have come to realize that it is possible to train a model to get the specific results I want. That is my major realization.
Conclusion
In conclusion, my journey into understanding artificial intelligence, data analytics, and machine learning has been a fascinating one, filled with complexities, challenges, and profound realizations. From my initial interactions with basic chatbots to delving into the mechanisms behind LLMs and ML models, I have gained a deeper appreciation for the inner workings of AI. The knowledge I am gaining is not merely about using the technology, but about understanding its underlying principles and the importance of clean, structured data.
This journey has also led me to recognize my own role in the AI landscape. While I am not the one building the systems or creating new AI tools, I am a super-user, learning to navigate the roads that have been paved by others. I've learned that the key to effective AI usage lies not only in the sophistication of the models, but also in the quality of the prompts and the data used for training.
With AI continuing to evolve and its applications growing more diverse, I am eager to see where this path leads next. The journey so far has been enlightening, and it has opened up a world of possibilities. It's a challenging field, but it offers immense potential for those willing to dive in and explore. I, for one, am more optimistic about the future of AI.
Images created with Google Gemini Advanced