Why ChatGPT is Good News for the Future of Exorde Labs

May 16, 2024

You are standing at the end of a road before a small brick building. Around you is a forest. A small stream flows out of the building and down a gully.

> GO INTO BUILDING

I don’t know that word.

> GO BUILDING

You are inside a building, a well house for a large spring. There are some keys on the ground here. There is a shiny brass lamp nearby. There is tasty food here. There is a bottle of water here.

> OPEN BOTTLE

I don’t know how to lock or unlock such a thing.

> UNSCREW BOTTLE CAP

What?

> OPEN THE BOTTLE OF WATER AND DRINK FROM IT

What?

> DELETE GAME

We’ve come a long way since Colossal Cave Adventure was released in 1976. If you’re old enough to remember the golden age of text adventure games, you’ll also remember that most of the experience was banging your head against the wall while trying to figure out the exact words and phrases the program was designed to understand.

Now imagine those same games with the language capabilities offered by ChatGPT.

You would have the ability to use just about any coherent sentence structure to describe what you want to do, and the game would understand your intent.

Much has, rightfully, been made of ChatGPT's ability to converse in natural-sounding language, but no less remarkable is its ability to understand the user's input without requiring any particular sentence structure. You can use slang, you can make typos, you can be quirky, and the software will, for the most part, understand what you're trying to communicate.
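To make that concrete, here's a rough sketch (ours, for illustration only) of how a chat model could sit in front of a classic two-word parser. The model name, prompt and helper function are assumptions rather than anything shipping in a real game or in Exorde; the point is simply that the LLM translates a messy, human request into the rigid vocabulary an old engine expects.

```python
# Hypothetical sketch: use a chat model to turn free-form player input into a
# classic two-word adventure command. Model name and prompt are illustrative.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

GAME_VERBS = ["GO", "TAKE", "OPEN", "DRINK", "DROP", "LOOK"]

def parse_command(player_input: str) -> str:
    """Ask the model to restate the request as '<VERB> <NOUN>'."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model name; any capable chat model would do
        messages=[
            {
                "role": "system",
                "content": (
                    "Rewrite the player's request as a two-word adventure command "
                    f"using one of these verbs: {', '.join(GAME_VERBS)}. "
                    "Reply with the command only."
                ),
            },
            {"role": "user", "content": player_input},
        ],
    )
    return response.choices[0].message.content.strip().upper()

print(parse_command("open the bottle of water and drink from it"))  # e.g. "DRINK WATER"
```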

This powerful effect is achieved by something called a Large Language Model (LLM): a neural network trained on huge quantities of text. LLMs are often rated by the number of parameters they have (essentially the number of values the model can adjust as it learns), and modern versions run to billions of parameters.
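For a sense of scale, here's a quick illustration using the Hugging Face transformers library: load a small public BERT checkpoint (the same family of model we mention below) and count its weights. bert-base-uncased comes in at roughly 110 million parameters, against the 175 billion quoted for GPT-3.

```python
# Count the trainable weights of a small public checkpoint, as a concrete
# example of what "parameters" means. bert-base-uncased is roughly 110M.
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
num_params = sum(p.numel() for p in model.parameters())
print(f"bert-base-uncased has ~{num_params / 1e6:.0f} million parameters")
```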

GPT-3 was the largest model of its day, with 175 billion parameters, and GPT-4 is already out and running, reportedly with over a trillion parameters.

All of which is very good news for Exorde Labs and future iterations of our application.

Structuring the Unstructured

“I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web — the content, links, and transactions between people and computers. A “Semantic Web”, which makes this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The “intelligent agents” people have touted for ages will finally materialize.”

Tim Berners-Lee, who invented the World Wide Web in 1990, was fascinated by the idea of creating a Semantic Web in which different kinds of data (video, text, audio, images, etc.) would be machine-readable and could therefore be processed faster and without the need for human guidance.

This would open up the web to a whole new world of possibilities. At a stroke you would have a mechanism for reducing administration, building smarter search engines, improving social media monitoring and enabling countless other innovations.

Unfortunately, despite much effort, the semantic web never happened. You see, computers like structured information, the kind that can be placed neatly into spreadsheets. And the vast majority of the web is unstructured.

Just like the text adventure games of the past, computers are very good at interpreting data they’ve been programmed to understand. But as soon as you step outside of these limitations, the computer responds with a resounding: “What?”

This is why LLMs are such big news. We’re still a long way from the utopia of a semantic web, but the ability to understand unstructured text is a huge leap forward. In fact, without this technology, Exorde wouldn’t exist.

If you’re unfamiliar with the Exorde project, our goal is to crawl huge quantities of information on the web, isolate data about a specific subject, and then assess the sentiment of the content.

The crawling process is handled through a decentralised network consisting of tens of thousands of machines. Every machine performs a small part of the task in parallel, so collectively large quantities of information can be gathered and sorted in a relatively short period of time.
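As a toy illustration of that "many small tasks in parallel" idea, the sketch below splits a handful of placeholder sources across local worker threads and merges whatever comes back. It's only a sketch: Exorde's real network spreads the work across tens of thousands of independent machines rather than threads on one box, and the URLs and crawl logic here are stand-ins.

```python
# Toy illustration only: split crawl work across workers and merge the results.
# Exorde's actual system distributes this across a decentralised network.
from concurrent.futures import ThreadPoolExecutor

SOURCES = [
    "https://example.com/feed/1",  # placeholder URLs, not real Exorde sources
    "https://example.com/feed/2",
    "https://example.com/feed/3",
]

def crawl(url: str) -> list[str]:
    """Fetch one source and return the text items found there (stubbed here)."""
    return [f"post collected from {url}"]

def crawl_all(sources: list[str]) -> list[str]:
    """Crawl every source in parallel and flatten the batches into one list."""
    with ThreadPoolExecutor(max_workers=8) as pool:
        batches = list(pool.map(crawl, sources))
    return [item for batch in batches for item in batch]

if __name__ == "__main__":
    print(crawl_all(SOURCES))
```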

But all of this would be useless without the ability to sort and measure the terabytes of data we're gathering. The only way to do this at speed is to utilise LLM technology to interpret the results. Right now, we're using the BERT language model to great effect, but ChatGPT has given us a glimpse of how much more we'll be able to achieve in the future.
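To give a flavour of what BERT-based sentiment scoring looks like in practice, here's a minimal sketch using the Hugging Face transformers pipeline. The checkpoint named below is a widely used public example, not necessarily the model running inside Exorde, and the posts are made up.

```python
# Minimal sentiment-scoring sketch with a public BERT checkpoint.
# The model name is an assumed example, not Exorde's production model.
from transformers import pipeline

sentiment = pipeline(
    "sentiment-analysis",
    model="nlptown/bert-base-multilingual-uncased-sentiment",
)

posts = [
    "This project keeps shipping, really impressed.",
    "Fees are getting ridiculous lately.",
]

for post, result in zip(posts, sentiment(posts)):
    print(f"{result['label']:>7}  {result['score']:.2f}  {post}")
```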

Progress is Exponential

Exorde has already produced some fantastic results, such as an application that continually monitors chat around Bitcoin and produces a sentiment analysis that can predict price movements with an accuracy of around 60% (this app is currently being updated and will re-launch in May 2023).

Creating a system that crawls countless millions of Reddit posts, tweets, news articles and blog posts is a huge achievement, and the sentiment analysis we perform simply wouldn’t have been possible even just a few years ago.

But there is still much progress to be made, especially in the field of language models. LLMs still sometimes struggle to identify things like sarcasm or humour. They can also be tripped up by local dialects, foreign languages and non-textual information.

Which is why ChatGPT is so exciting for us and every other tech company working in a field related to LLMs. It's given us a glimpse of what the next generation of natural language processing can achieve, and of what the generation after that may be capable of.

We’re still catching our breath over what GPT-3 can do, and now GPT-4 is already available to a small audience (Spoiler — it can take images as input in addition to text).

We’re not about to plug Exorde into GPT. The model is a work in progress and functions quite differently to the NLP system we’re already using. But we’d be lying if we said we weren’t already considering whether this might be a good move in the future and, if so, how much further we can stretch Exorde’s capabilities.

Every time this field of AI (NLP) improves, Exorde’s potential grows with it. It’s like a factory that gets better on its own because someone comes in overnight and upgrades the machines that produce our results.

We’re excited to follow the progress of GPT and other similar AI projects. The semantic web isn’t here yet, but we’re closer than ever to realising it.

If you’re curious and have a lot of patience, you can play Colossal Cave Adventure in your browser at https://rickadams.org/adventure/advent/