Generative artificial intelligence is one of the greatest inventions in human history.
(This newsletter is focused on large language models in general, and generative pretrained transformers — GPT-style systems — in particular, though I touch on all kinds of generative AIs, and their impact on the world, when relevant.)
Does this sound like hyperbole? Consider just three applications of this technology — generative AIs can:
Generate code — reliably generate working code for the vast majority of simple or moderately complex algorithms (example, generated code for example)
Translate languages — translate text from any (reasonably common) human language to any other (reasonably common) human language with high accuracy and with a fairly good grasp of cultural and grammatical nuances (example)
Retrieve knowledge — retrieve (reasonably) accurate answers to the vast majority of simple queries involving common knowledge, phrased in straightforward human language (example)
Any one of these three examples would, if implemented as a stand-alone application, count as a revolutionary piece of technology. Each one will almost certainly have world-shattering implications:
Reliable automated code generation will transform the economics of every business and organisation that produces software, or that makes money from custom software.
Reliable human language translation will make language barriers obsolete, and will enable communication between any two human beings with access to the internet.
Reliable answers to natural-language queries will transform every organisation that organises knowledge, and the educational journey of every student around the world with access to a computer.
But humanity has only just begun to scratch the surface of what is possible with this new technology. Over the next ten to fifteen years, we may well see two or three dozen more applications with an impact equal to or greater than that of the three above.
An obvious caveat: LLMs, even the most advanced systems, are not perfectly accurate or reliable. Their output cannot always be trusted and it is relatively easy to spot mistakes or to “catch them out” — one way or another.
In simple terms: we are not quite there yet.
However, the reliability of these systems is notably superior to that of equivalent earlier systems (as can be demonstrated by comparing ChatGPT’s translation abilities to those of Google Translate), and they are improving all the time.
Most crucially: the reliability of these systems can be arbitrarily improved by “augmenting” them with conventional software — or, more precisely, by augmenting a system built from conventional code with automated prompts dispatched to GPT or other LLM systems.
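As a minimal sketch of this augmentation pattern, the conventional-code wrapper below validates an LLM's output and re-prompts on failure. Here `call_llm` is a hypothetical stand-in for a real API call, stubbed with a canned reply so the sketch is self-contained:

```python
import json

def call_llm(prompt: str) -> str:
    """Hypothetical LLM call. In practice this would dispatch the
    prompt to GPT or another LLM via its API; stubbed here."""
    return '{"capital": "Paris"}'

def ask_with_validation(prompt: str, required_keys: list[str],
                        max_retries: int = 3) -> dict:
    """Conventional code wrapped around an LLM call: check the output
    against hard requirements and re-prompt until it passes."""
    for _ in range(max_retries):
        raw = call_llm(prompt)
        try:
            data = json.loads(raw)       # reply must be valid JSON...
        except json.JSONDecodeError:
            continue                     # ...otherwise, try again
        if all(key in data for key in required_keys):
            return data                  # passed every check
    raise ValueError("LLM never produced valid output")

result = ask_with_validation(
    'Reply with JSON only: {"capital": <capital of France>}',
    ["capital"])
```

The point is that the validation and retry logic lives in ordinary, deterministic code: each extra check the wrapper enforces converts a probabilistic LLM reply into output the rest of the system can rely on.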
A central thesis of this newsletter is as follows:
The skilful implementation of prompt-augmented engineering is the master key that unlocks the full potential of generative artificial intelligence. Prompt-augmented engineering — also known as prompt pipeline engineering — will give humanity access to the full capabilities of generative AI, with as much flexibility, reliability and security as any given use case demands.
In future posts, I will investigate in detail the theory, practice and potential ramifications of prompt-augmented engineering.
I will also seek out the other master keys that will enable you — and the people around you — to unlock the greatest possible value from this new technology, to ride the wave of the LLM revolution, and to achieve your hopes and dreams.
Is this possible?
Why not?
If humans built LLMs, why shouldn’t humans use LLMs to achieve human goals?
To return to the original claim: the seismic impact of the generative AI revolution is, in many respects, an aftershock of the world-shattering impact of the computer and internet revolutions.
The implementation of universal Turing machines in physical reality — the invention of the first microprocessors, CPUs, and so on — unlocked the first wave of algorithmic automation. The development of packet-switched networking enabled virtually free international communication — and ultimately led to a single world-embracing network connecting billions of Turing machines into an always-online, continually-communicating electronic matrix.
Over the last five decades, the appearance of personal computers, of hypertext, of smartphones, and a whole series of other secondary inventions — all following on from the primary inventions of the computer and the internet — has transformed the world.
The primary “information revolution” has, in reality, consisted of an ongoing series of revolutionary waves. Large language models are, in many respects, nothing more, and nothing less, than the latest wave in the series.
With that in mind, is there anything special about generative AI, that distinguishes the present wave from previous waves?
Two things stand out:
Firstly, generative AI systems gather together all of the novel capabilities that have been brought into existence in the five decades of the computer and internet revolutions, and unite them into a single-point solution.
Secondly, generative AI systems (and GPT-style systems in particular) function as multi-purpose tools that can be shaped and moulded to fit a highly diverse array of potential use cases.
Let me repeat the first point for emphasis:
Generative AI systems gather together all of the novel capabilities that have been brought into existence in the five decades of the computer and internet revolutions, and unite them into a single-point solution.
This means that:
There is a vast ocean of software tools that are potentially enormously valuable, but that are practically greatly underutilised, whether owing to technical issues, social issues, or other barriers. (Dropbox is a striking example — similar tools existed prior to its launch, but were inaccessible to the vast majority of users. Packaging up existing technology and bringing it to the masses turned out to be a huge economic opportunity.)
LLMs have an extremely strong “iceberg effect” — 98 or 99% of their capabilities lie in their ability to connect to existing (and underutilised) solutions. (A simple technical example is the ability of ChatGPT to quickly generate glue code to connect different APIs, which greatly reduces the friction in integrating disparate pieces of software.)
There is potentially enormous economic value in creating the single-point solution — the single point of access that most efficiently connects user needs to practical solutions.
As for the second point:
Generative AI systems (and GPT-style systems in particular) function as multi-purpose tools that can be shaped and moulded to fit a highly diverse array of potential use cases.
This point is of more relevance to people using LLMs as components of larger systems — such as those applying the strategies of prompt-augmented engineering, and/or prompt pipeline engineering, mentioned above.
I will give two examples here — one technical, one non-technical:
Technical example: GPT can effectively emulate an arbitrary API. You can send arbitrary queries to GPT, and receive responses structured exactly as you specify, to any level of granularity. (Example.) Data contained within one response can be used to formulate new queries. (Example.) If wrapped in conventional code that calls OpenAI’s own API, these “virtual API endpoints” can be called programmatically from another piece of software, and effectively treated as a regular API.
(And, yes, GPT can generate code that calls these virtual APIs, effectively enabling the system to call itself programmatically; however, “self-referential code generation with GPT” is a topic that deserves at least its own article, and could easily fill several.)
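A minimal sketch of such a "virtual API endpoint" might look like the following, assuming a hypothetical `call_gpt` function in place of a real request to OpenAI's API (stubbed here with a canned reply):

```python
import json

def call_gpt(prompt: str) -> str:
    """Hypothetical stand-in for a real request to an LLM API;
    stubbed with a canned, schema-conforming reply."""
    return '{"city": "Tokyo", "population_millions": 37.4}'

def virtual_endpoint(query: str, schema: dict) -> dict:
    """Treat GPT as an API endpoint: the prompt pins down the exact
    response structure, and conventional code parses the reply."""
    prompt = (
        "Answer the query below. Respond ONLY with JSON matching "
        f"this schema (keys and types): {json.dumps(schema)}\n"
        f"Query: {query}"
    )
    return json.loads(call_gpt(prompt))

answer = virtual_endpoint(
    "What is the largest metropolitan area in the world?",
    {"city": "string", "population_millions": "number"})
```

Because the caller receives an ordinary parsed dictionary, nothing downstream needs to know that the "endpoint" is an LLM rather than a conventional web service — which is exactly what lets these virtual endpoints be chained and composed like regular APIs.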
Non-technical example: GPT can fulfil arbitrary knowledge-related, or language-related, “micro-tasks”. For example, you can enter an arbitrary topic and ask for a list of topics that are related to the original topic. (Example.) You can enter an arbitrarily complex sentence, have it broken down into simple sentences, and then have the subject, verb and object for each simple sentence identified. (Example.) You can enter a list of arbitrary words and phrases and have them categorised both linguistically (as nouns, verbs, adjectives, and so on), or semantically. (Example.)
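The first of these micro-tasks, generating related topics, could be wrapped in conventional code roughly as follows; `call_gpt` is again a hypothetical stub standing in for a real LLM call:

```python
def call_gpt(prompt: str) -> str:
    """Hypothetical LLM call, stubbed with a canned reply of
    one related topic per line."""
    return "Neural networks\nTransformers\nAttention"

def related_topics(topic: str, n: int = 3) -> list[str]:
    """Micro-task: ask for n topics related to the given topic,
    one per line, and parse the reply into a Python list."""
    prompt = (f"List {n} topics closely related to '{topic}', "
              "one per line, with no numbering.")
    return [line.strip()
            for line in call_gpt(prompt).splitlines()
            if line.strip()]

topics = related_topics("deep learning")
```

Each micro-task of this kind takes only a few lines of wrapper code, which is what makes it practical to compose dozens of them into a larger pipeline.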
In both of these cases, GPT is effectively being used to generate lego bricks of arbitrary shape, which can then be used to build larger systems — larger systems which can serve many different kinds of purposes.
For example: with a chain of “virtual API calls” one could populate a complex data store filled with any kind of information that was available: geographic, economic, or otherwise. A sequence of knowledge-based micro-tasks could be used to build up a structured knowledge base from the unstructured text in books and articles.
(In both cases, you could either use GPT to process outside sources of information, or you could attempt to “milk” the language model itself for implicit knowledge. Many of the examples above use the latter technique — it is surprisingly effective, though it is obviously not remotely reliable as a general approach. However, augmenting GPT with more reliable external data sources looks likely to be a very powerful technique with many applications.)
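A rough sketch of the second pipeline, building a structured knowledge base from unstructured text, might look like this, with `call_gpt` once more a hypothetical stub in place of a real LLM call:

```python
def call_gpt(prompt: str) -> str:
    """Hypothetical LLM call, stubbed with canned
    subject|relation|object triples, one per line."""
    return "Paris|capital of|France\nSeine|flows through|Paris"

def extract_triples(text: str) -> list[tuple[str, str, str]]:
    """Micro-task: ask the model to extract factual triples from
    free text, then parse and filter the reply in ordinary code."""
    prompt = ("Extract factual triples from the text below, "
              "formatted as 'subject|relation|object', one per "
              "line, with no other output.\n" + text)
    triples = []
    for line in call_gpt(prompt).splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) == 3:              # discard malformed lines
            triples.append(tuple(parts))
    return triples

kb = extract_triples(
    "Paris, the capital of France, sits on the Seine.")
```

Run over a corpus of books or articles, and with the malformed-line filtering above acting as the conventional-code safety net, a loop of such calls would accumulate exactly the kind of structured knowledge base described.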
The paths opened up by such techniques are extremely broad: generative artificial intelligence reconnects and reintegrates all of the new possibilities that have been opened up by computers and the internet over the last fifty or sixty years.
Generative artificial intelligence opens up a whole new world, though much of it — perhaps the vast majority — remains shrouded in dark clouds and deep mists.
—
It is rare to end up stuck when building something on top of generative AI systems — there is almost always a way to get to where you want to go.
The key question, then, is how to decide where exactly one wants to go.