Darek: Hi, Gerard. How are you?

Gerard: Hello, Darek. I’m very good thanks!

Darek: Please tell us a bit about yourself.

Gerard: My name is Gerard Sans. I’m a Developer Evangelist for Web and Cloud. I’ve been doing some research on Artificial Intelligence and Web3 for the last year and I’m here to share some of my learnings with you.

Darek: There has been a lot of discussion about how AI is going to replace some jobs. What do you think about that?

Gerard: These are very exciting times. We’ve seen a lot of improvements in AI in the past few years. Some of them will certainly change how people do their jobs. The most significant changes have happened in text-to-image generation. This is where an AI model generates images from scratch using a text prompt. We’ve seen amazing progress in this area. You can easily generate illustrations or even photorealistic images, and it keeps getting better and better. A lot of designers and digital artists are finding that their jobs are suddenly in the spotlight. At this time, it’s not clear whether people will care if an artwork was created by a digital artist or an AI.

This white t-shirt doesn’t exist; it was created by DALL-E.

Darek: Which jobs are more likely to be affected: writers, artists, or developers?

Gerard: AI’s latest developments have been very surprising for me. Initially, I was in awe looking at some of the text generated by AIs. This technology is very similar to what I explained for image generation, but for writing. These AI models have been trained using a very large amount of data, including publicly available books from the last few centuries. So these models can generate text in all the styles that you can imagine: commercial books, self-help, and cooking recipes, but also philosophy, math, and all of the sciences. If you think about important authors like Shakespeare or other famous authors in history, their styles are also embedded in these models. So today, you can effortlessly generate text that resembles any of the most famous writers in the world, much like DALL-E generating an image in the style of Van Gogh. This is not limited to literature; it also includes all sorts of specific domains like marketing.

For example, if you are a writer, you could use these AI tools to avoid what’s called writer’s block. You can get inspiration for a story that you are stuck on and get a fresh start that follows your own style. People are using this technology to generate anything from blog posts to cooking recipes and anything in between. It’s quite exciting but also quite uncertain how it will affect jobs like technical writers or even novelists. These jobs are all in the spotlight. Most people know about these AI technologies but they don’t think that they will be affected by them.

It was only last year that OpenAI released Codex, the model behind GitHub Copilot, a product by Microsoft. Codex can write code from scratch and can also change existing code by following instructions. That covers most of what you need as a developer: you can write code and then iterate over it as many times as you need. These are only a few examples of what these tools are capable of today. So yeah, it looks like we are witnessing the birth of a new era of AIs.

Using Codex you can specify what you need in a comment and it will create the code for you.

Darek: What is OpenAI?

Gerard: OpenAI is a company that was founded by a group that included Elon Musk, Sam Altman, and Greg Brockman. Their mission is to build a safe Artificial General Intelligence (AGI) that benefits all humanity.

OpenAI was founded back in 2015 as a non-profit company. It took them until 2018 to create the first GPT model; GPT stands for Generative Pre-trained Transformer. It is a type of Large Language Model (LLM). They opened a public API in 2020, which is when everyone started playing with this technology and discovering its many applications. The first applications were focused on writing, but very soon they realised that GPT-3 was able to do things like writing valid JavaScript. That was when they started working on Codex, which was released to the public in 2021. This is a new technology that has only been around for a few years, and it is just fantastic.

NLP models up to 2022 (not included)

Darek: What products are being offered today by OpenAI?

Gerard: There are three different products available from OpenAI.

The first is GPT-3, a model that can write text indistinguishable from human writing. It works by providing a starting text, a prompt, and GPT-3 will keep writing in a way that matches the existing text. For example, you can use it to create a story, or continue a conversation between characters. Some of these capabilities were not planned at all by their creators. This model was created to generate text by predicting the next word following a prompt, but then they realised that the model could identify more complex patterns, like a series of questions and answers, or even a general conversation. During the beta access, users found out that GPT-3 was able to do a lot of different things. You can, for example, ask for cooking recipes and it will give you the ingredients and how to cook them. That’s not text generation anymore, in the sense of writing a story or writing a book, but it’s more like identifying a question and providing an answer to that question. During the initial stages, a lot of different applications were identified. One of these was that it could write HTML, CSS, and JavaScript.
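To make the idea of "predicting the next word following a prompt" concrete, here is a minimal sketch in Python. It is only a toy: it counts which word follows which in a tiny made-up corpus and extends a prompt with the most frequent follower. GPT-3 itself is a large neural network, not a word-pair counter, and the function names and corpus below are invented for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    follows = defaultdict(Counter)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def complete(prompt, follows, max_words=5):
    """Extend the prompt by repeatedly appending the most frequent next word."""
    words = prompt.split()
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical training text, chosen only to keep the example small.
corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigrams(corpus)
print(complete("the cat", model, max_words=3))
```

The loop in `complete` is the part that mirrors the description above: the text produced so far is the only input, and each step adds one more word that matches it.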

This led to the creation of Codex, a product specialising in coding. This is when OpenAI partnered with Microsoft, getting access to 54 million GitHub repositories. The outcome was that it learned how to use 12 different languages, including SQL and Bash commands.

Finally, DALL-E is the one I started talking about at the beginning.

Darek: What is GPT-3 trying to solve?

Gerard: GPT-3 is a Natural Language Processing (NLP) model that generates text. GPT-3 can be given a text input and then generate any number of words. Usually, a stop sequence is used, so that GPT-3 will stop after the first paragraph. However, you can create complex patterns, such as a question-and-answer text, and GPT-3 will follow that. You can also ask for instructions and it will provide the instructions to achieve a goal, such as cooking a recipe.
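As a rough illustration of the question-and-answer pattern and the stop sequence, the Python sketch below builds a prompt from example pairs and then cuts a continuation at the stop sequence. No model is called here: the `raw` string is a hypothetical continuation written by hand, and the truncation only simulates what a stop sequence does on the service side. The helper names are assumptions made for this example.

```python
def build_qa_prompt(examples, question):
    """Lay out question/answer pairs so a model can continue the pattern."""
    lines = []
    for q, a in examples:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    lines.append(f"Q: {question}")
    lines.append("A:")  # the model is expected to write what comes next
    return "\n".join(lines)


def apply_stop_sequence(generated, stop):
    """Cut the generated text at the first occurrence of the stop sequence."""
    index = generated.find(stop)
    return generated if index == -1 else generated[:index]


prompt = build_qa_prompt(
    [("What is the capital of France?", "Paris")],
    "What is the capital of Poland?",
)

# Hypothetical raw continuation: without a stop sequence the model may
# keep going and invent further questions of its own.
raw = " Warsaw\nQ: What is the capital of Spain?\nA: Madrid"
answer = apply_stop_sequence(raw, "\nQ:").strip()
print(answer)
```

Using "\nQ:" as the stop sequence keeps only the answer to the question that was actually asked.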

Most common usages for GPT-3 by OpenAI.

Darek: Cooking recipes created by an AI?!

Gerard: Yeah. It’s quite an exciting development. The idea behind it is that these models are very large. GPT-3 has 175 billion parameters. That means it can generate more combinations of text than there are atoms in the universe. So this is a very big model that we cannot begin to imagine. And of course, one of the things it contains is cooking recipes. But this is not the most surprising thing that you can find inside these models.

The model can code in JavaScript! But it doesn’t stop there. Now it has also learned TypeScript, Python, Ruby, Go, C#, Swift, Rust, and PHP, and all of these languages after just a little less than half a terabyte of extra training data.

Probably the weirdest thing is that it can speak more than a hundred languages, even though it was trained almost entirely on English text. This is very surprising. For some languages, GPT-3 was able to learn from less than one percent of the training data. We don’t know how that happened. It was only really supposed to learn English.

Training Dataset for GPT-3.

One of the largest sources of information was the Common Crawl. These are the top websites and all their accompanying raw files. That’s how it learned JavaScript because a lot of its content is HTML, CSS, and JavaScript files. But then it also took information from hundreds of years of public books (Books1–2). And funny enough, from Reddit too (WebText2). So there’s a lot of information from users and the links that these users provided. So anything you can imagine is in there.

The Common Crawl corpus contains petabytes of data collected over 12 years of web crawling.

Another resource was Wikipedia. So anything that is in Wikipedia is also part of the training data for GPT-3. It’s a massive amount of data. It can do things like translate between languages. So you can write something in French and it can give you the translation in Polish.

If you look at the complexity of a language like English, then you can understand how it’s able to learn JavaScript, because JavaScript is much simpler than English.

Darek: Yeah exactly.

Gerard: If you’re wondering how it could learn Japanese with almost no information, you’re not alone. This is just the beginning of what AI is capable of, and we’ve only seen a fraction of its potential.

GPT-3 is so large that it would be impossible for any one person to explore its entire latent space. What we’ve seen so far is just a drop in the bucket. For example, GPT-3 being able to generate cooking recipes is a small feat compared to what it’s capable of.

Latent space is a mathematical space that maps what a neural network has learned from training.

It’s been more than half a year since GPT-3 was first used in production with GitHub Copilot, and it’s still going strong. This shows that it’s not only capable of understanding JavaScript but also able to provide code to professional programmers who are building all sorts of applications.

In short, we haven’t seen anything yet. GPT-3 is just getting started, and its potential is massive.

Darek: What is Codex’s main purpose?

Gerard: The main purpose of Codex is to generate valid code. As we have explained before, it is capable of writing code in many different programming languages, which makes it a valuable tool for developers. However, it is important to keep in mind that Codex was trained with a snapshot of GitHub repositories from last year. It will not be aware of anything published after that snapshot. The reason it is not retrained regularly is that the process is very expensive and requires massive amounts of computing power.

Main languages supported by Codex.

Darek: What is DALL-E and how are people using it?

Gerard: DALL-E is a model from OpenAI that is trained to generate images from a prompt. It is by far the most popular. DALL-E is everywhere on social media, from TikTok to Twitter. We have seen how OpenAI models learned to generate text and code, and this is the same for images.

Everything that GPT-3 knows, DALL-E knows. This means that it’s a writer, it’s also a developer, and it now also has data about images. DALL-E was trained with 400 million images along with their metadata (title, description, creation date, camera, lenses, geolocation, etc). This allowed DALL-E to establish relationships between images and words. As a result, it can generate any sort of image following a prompt very closely. You can either generate an image from scratch or edit images by giving it instructions.

For example, one of the experiments that I did recently was to create a photorealistic picture of a woman using DALL-E. I provided information about the composition, how the model was positioned, the lighting, and the camera angle. I got the picture I was after, and then I edited the image and asked DALL-E to add a black leather jacket to the model. Not only did it add the jacket to the picture, but it also took into consideration the volume and the lighting, and placed the jacket on the model in a realistic way!

From left to right: initial image, editing and final result. Face fixed using tencentarc/gfpgan.

DALL-E can not only do photography, but pixel art, 3D models, games, and illustrations too. Of course, photography is the most impressive, but you can also do things like hand-drawn illustrations. It’s something else.

People are using DALL-E to create games because creating 3D models requires quite a lot of work. Sometimes you may need to create textures. Let’s say you are creating a castle and there’s some pattern in the walls which is sometimes made of rock or wood. It’s a difficult task for a digital artist to create a lot of variations for these textures. Today, you can give it to DALL-E and it will create the textures and variations for you. You can then use it in a 3D model and in a few minutes, you have got everything ready to go. You can build games fast using tools like DALL-E.

Darek: That sounds amazing.

Gerard: Yeah, there are some fancy use cases for it. For example, I can see that you have a sofa in the background. So, you could take a picture of a room and ask DALL-E to add furniture. So, imagine something is missing, maybe you want to add some artwork. You can give this task to DALL-E and it will just decorate your room in different styles if you like.

People are also using DALL-E to recreate historic scenes. For example, you could ask it to give you a selfie of a soldier in Normandy during World War II. It’s anything that you can imagine. DALL-E can use all the information that you give it. It’s more like having a creative director that puts all the information together and makes it work.

Of course, the technology is very new. We are talking about a technology that didn’t exist last year. So, in just one year it has made a lot of progress. But we will see many other things happening with all of these different products. DALL-E is probably the most surprising because it’s very visual.

Darek: Can you use OpenAI today? How would that work?

Gerard: The OpenAI API has been available to anyone since 2021: you create an account and buy some credits. You just need to get your API key, like you do with similar services, and you are ready to go. For the Codex API, you will need to apply for beta access. So far there’s no API for DALL-E, but you can use all of the models in the OpenAI Playground.
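For readers who want to see what "get your API key and you are ready to go" can look like in practice, here is a minimal Python sketch that prepares, but does not send, a text completion request. Everything specific in it is an assumption rather than something stated in the interview: the endpoint URL and the `text-davinci-002` model name reflect the text completion API as it was offered around 2022 and may have changed since, `OPENAI_API_KEY` is just a conventional environment variable name, and `build_completion_request` is a hypothetical helper. Check the current OpenAI API reference before relying on any of it.

```python
import json
import os
import urllib.request

# Assumption: the text completion endpoint as offered around 2022.
API_URL = "https://api.openai.com/v1/completions"


def build_completion_request(prompt, api_key, model="text-davinci-002",
                             max_tokens=64, stop=None):
    """Prepare an HTTP POST request for a completion without sending it."""
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    if stop is not None:
        payload["stop"] = stop  # optional stop sequence
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # the API key goes here
        },
        method="POST",
    )


# The key is read from the environment; the fallback is a placeholder only.
api_key = os.environ.get("OPENAI_API_KEY", "sk-placeholder")
request = build_completion_request("Write a recipe for pancakes.", api_key)
print(request.get_method(), request.full_url)
```

Sending the request would be a call to `urllib.request.urlopen(request)`, which needs a valid key and credits on the account, so it is left out of the sketch.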

Available APIs from OpenAI.

Darek: What are the benefits of using those tools? Are there any drawbacks?

Gerard: I think one of the benefits of GPT-3 is that it can boost your productivity. So if you need to create a technical blog post, for example, GPT-3 can help you do that in a fraction of the time it would normally take. With other products, it may depend on what you want to use them for. Sometimes the quality isn’t quite there yet, but you can use parts of the output to speed up your project. So it depends on exactly what you want to use it for, but GPT-3 can help you be more productive.

As a drawback, I would say that GPT-3’s reliance on statistical methods means that there is no easy way to know if the information it gives you is factual. So if you are writing a blog post and some of the information is not factual, it can be a problem. Depending on how much you rely on AI, you will need to review it. You cannot just take the output from GPT-3 and use it in a publication and release it without checking the facts as you may be embarrassed later on.

It just depends on what you’re trying to do. Sometimes when you try to do something complex, there can be some issues with the technology. For example, in DALL-E, when it creates an image, human faces in the pictures don’t always look perfect. But I think these errors or these drawbacks are going to be fixed and then it’s going to be much, much faster to use these tools.

Darek: What might the future of AI look like in five years?

Gerard: I just did a talk around this topic where I described the AI Revolution. This is from the work of different authors.

The first stage is called the Cognitive Revolution. This is where we use AI Assistants to get our job done faster. Depending on what you do, that will work better or worse. I mean, even with GitHub Copilot, it’s not that the tool is creating everything for you. It’s just giving you some kind of productivity boost.

The next step, which is a little bit more far-fetched and it’s probably not going to be that soon, is the Robotics Revolution. This means fully autonomous robots. I think I saw the first robot in Silicon Valley some years ago. They go around a complex and they can give you information about what you can visit. But of course, this is not something that you can find everywhere yet, but in the future, we will see more examples of these.

The third stage is when AIs get smarter than humans, which is called the Singularity. At this point, the AIs will be similar to Data, the character in Star Trek. Data is an android that helps the Star Trek characters when they struggle or go on adventures. If there’s some danger and they need to make a decision, they ask Data.

People are not certain when all of these different stages will happen but there’s a recent quote from Elon Musk where he said that we are going to be overtaken by AIs in the next five years.

Darek: Do you think AI can help us build something we can’t do ourselves for now, like a super engine for a rocket?

Gerard: Yes. What is happening is that with these models we have created some kind of super intelligence. This means that they have all the information accessible at the same time. One of the technologies being used behind the scenes is neural networks. This allows for a faster response time, as they don’t need to search for the answer. The neural network is more like a human in that it doesn’t stop to think, it just gives you the answer right away. When you use systems like GPT-3 and Codex, they answer back almost immediately. That means that all of the information is accessible to them right away.

Darek: So when you ask it, for example, how to build a rocket engine, it will give you the answer right away?

Gerard: Exactly. The moment you ask, it will give you the answer right away. It doesn’t need to think about it for a while and give you an answer next week. No. It gives you the answer right away.

Darek: What’s your conclusion?

Gerard: My conclusion is that nobody knows exactly what is happening with these models, but it’s exciting because people find new use cases and new things that you can use these models for every day. There has also been news about open-sourcing some of these technologies so you can use them in your projects, which is also very exciting.

Darek: Can you give some advice to people who are starting their career in IT today? And for those who are old-timers?

Gerard: I think a good piece of advice is to keep your mind open because everything is changing. People think that things don’t change, but in the area of AI, things are changing fast. For a long time, AI was not a technology that changed a lot. It’s kind of an old area, but the number of discoveries has exploded in the last few years. It’s a bit like JavaScript. You know, the new JavaScript that happened a few years back, when people were surprised: “Oh, can you do this with JavaScript?” I mean, before, we didn’t use JavaScript in this way. So we are now facing a new generation of AI and this will change everything. A lot of the jobs that we do today won’t exist in the future. So if you’re not retiring, I mean if you’re an old-timer who maybe still has five, ten, or 15 years left in your career, I think you should look into what’s happening in blockchain, what’s happening in web3. Keep your mind open, because you can become obsolete very fast.

New generations, I think, should also be careful about what they focus on. If you focus on learning a specialised skill and then realise that anyone can be productive in that technology because of AI assistants, you will have wasted your time. Why would you want to memorise things or understand things when an AI can help you? So it’s not clear that you want to go through a 15-year specialisation and then find out that someone who doesn’t even have a specialisation is using a tool that helps them achieve the same results in just five years. Then you have lost 10 years learning something that wasn’t necessary. So that’s something to be careful about.

Darek: Thank you, Gerard, for joining today’s podcast and sharing your knowledge with us. I look forward to meeting you in Warsaw soon.

Gerard: Thanks a lot for inviting me. I’m looking forward to another edition of the conference. Thank you so much. Bye-bye, everyone!

Thanks for reading!

Have you got any questions? Feel free to leave your comments below or reach out on Twitter at gerardsans.

Resources

You can also listen to the original podcast using the links below.

Spotify: https://spoti.fi/3CHg2Zb

Google: https://bit.ly/JSMP-5

Apple: https://apple.co/3CdOJUY

Anchor: https://bit.ly/JSMP-05

YouTube: https://bit.ly/JSMP-GS-5

Gerard Sans

Helping Devs to succeed #AI #web3 / ex @AWSCloud / Just be AWSome / MC Speaker Trainer Community Leader @web3_london / @ReactEurope @ReactiveConf @ngcruise