Generative AI, what is it all about?
ChatGPT, GPT-3, GPT-4, DALL-E 2, LaMDA, Bard, ERNIE, Hugging Face BLOOM, Whisper, Craiyon, Midjourney, Stable Diffusion. These are just a few of the names that come up most often when generative AI is discussed. But where to begin, and how to make sense of the rapid changes occurring in this field? This page gives an introduction to the topic with up-to-date information.
Recently on the web:
25-March 2023: Sam Altman interview.
21-March 2023: Google’s Bard open access.
17-March 2023: An early look at the labor market impact potential of large language models
16-March 2023: Baidu unveils their ERNIE AI
16-March 2023: Introducing Microsoft 365 Copilot
History of generative AI
1980s
Probabilistic models, such as Hidden Markov Models (HMMs) and Bayesian networks, were used to generate simple language and music. These models were based on rules and probabilities, and could generate data that was similar to a given dataset.
1990s
Recurrent Neural Networks (RNNs) were introduced as a powerful tool for generating sequences of data, such as text, speech, and music. RNNs were able to learn patterns in data and generate new data that was similar in style and content to the input data.
2010s
Generative Adversarial Networks (GANs) were introduced in 2014. A GAN consists of two neural networks: a generator and a discriminator. The generator creates new data samples that resemble the original dataset, while the discriminator tries to distinguish the generated data from the original data. Through training, the generator becomes better at creating data that can fool the discriminator.
Deep learning techniques, such as Convolutional Neural Networks (CNNs), were used to improve the performance of GANs for image and video generation. Variational Autoencoders (VAEs) were also introduced as an alternative generative model to GANs. A VAE uses an encoder-decoder architecture to learn a low-dimensional representation of the input data, which can then be used to generate new samples.
2020s
Generative AI models have continued to improve in performance and diversity, with new models such as Normalizing Flow models and Transformer-based language models being developed. These models are being used in a variety of applications, such as image and video generation, natural language processing, and drug discovery. The field is still rapidly evolving, with new models and techniques being developed to push the boundaries of generative AI.
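To make the earliest entry in this history concrete, here is a minimal sketch of probabilistic text generation. It uses a first-order Markov chain over words, which is a simpler relative of the Hidden Markov Models mentioned above, not an HMM itself. The function names (build_chain, generate) and the toy corpus are made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, start, length, seed=None):
    """Walk the chain from `start`, sampling each next word at random."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by another
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Toy corpus (hypothetical); a real model would be built from far more text.
corpus = "the cat sat on the mat and the dog sat on the rug"
chain = build_chain(corpus)
print(generate(chain, "the", 8, seed=0))
```

Words that follow a given word more often in the corpus appear more often in its list, so they are sampled more often. The output is therefore "similar to a given dataset" in the limited sense that every adjacent word pair in it was seen in the corpus.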
The Main Players
What follows is an overview of the current applications relevant for each medium. Note that this figure can go out of date fast because innovation is occurring so rapidly. GPT-4 has come out and replaced GPT-3, Google’s Bard (a ChatGPT competitor) has launched, and these chatbots are quickly being integrated into software products such as Microsoft Azure and Google Docs.
OpenAI
OpenAI is an AI research organization founded in 2015 by a group of industry leaders, including Elon Musk and Sam Altman. OpenAI’s mission is to create safe and beneficial AI that can help humanity as a whole. What follows is a list of OpenAI’s most popular products as of March 2023:
- GPT (Generative Pre-trained Transformer): A family of language models capable of generating human-like text based on a given input or prompt.
- DALL-E: An image generator that can create images from textual descriptions using a combination of language and computer vision models.
- Microscope: A tool for visualizing and understanding the behavior of neural networks by visualizing activations and gradients of individual neurons.
- GPT-Neo: A large-scale transformer-based language model trained on a diverse range of internet text. Note that GPT-Neo is not an OpenAI product: it is a community-driven, open-source project by EleutherAI, built as an alternative to GPT-3.
- Codex: An AI-powered code autocompletion tool that can generate code snippets based on natural language descriptions.
Google
Google’s generative AI projects span a range of applications, including music, art, and interactive experiences, and demonstrate the potential of AI to generate novel and creative content.
- LaMDA: A conversational language model developed by Google that can engage in multi-turn conversations and maintain a consistent persona and context over multiple interactions.
- Bard: A conversational AI chatbot developed by Google as a competitor to ChatGPT. It is built on LaMDA and opened for early access in March 2023.
- AutoDraw: An AI-powered drawing tool that can recognize and suggest drawings based on user input.
- WaveNet: A generative model for speech and audio synthesis that uses a deep neural network to generate high-quality audio waveforms.
Facebook
Facebook’s generative AI projects demonstrate the potential of AI to generate new and creative content, and to enhance the capabilities of existing applications such as image and video editing, conversational AI, and text analysis.
- BigGAN (from DeepMind, not Facebook): A large-scale generative model for high-quality image synthesis, capable of generating realistic images of various categories such as animals, objects, and scenes.
- StyleGAN (from NVIDIA, not Facebook): A generative model for high-resolution image synthesis, capable of generating high-quality and diverse images with controllable style and appearance.
- Vid2Vid (from NVIDIA, not Facebook): A video-to-video synthesis model that can generate realistic videos of objects and scenes based on a given input video.
- BlenderBot: A conversational AI system that can engage in chit-chat conversations and answer a wide range of questions on various topics, trained on a diverse range of conversational data.
- Inpainting: An AI tool for image inpainting, which can fill in missing parts of an image by generating plausible pixels based on the surrounding context.
Outside the USA
Although US companies appear to be furthest ahead in generative AI, there are various promising projects across the globe.
- Baidu’s ERNIE Bot is based on its large language model, ERNIE, introduced in 2019 and named for the Sesame Street character in a cheeky riposte to Google’s own large language model, BERT, introduced in 2018.
- Alibaba’s M6 has been optimized for Chinese NLP tasks. It performs well in tasks like text classification, sentiment analysis, and question answering.
- Tencent’s Hunyuan model is designed to provide high-quality machine translation for Chinese-English and English-Chinese language pairs. The model has been trained on a massive parallel corpus, and it focuses on improving translation accuracy and fluency.
- Tsinghua University’s Knowledge Engineering Group has open-sourced its GLM-130B project, a pre-trained Chinese and English large language model with high accuracy on downstream tasks.
Dangers of Generative AI
With innovation moving this fast, no one has a clear view of what is and is not possible. Science fiction writers have explored this area since the 1980s, and at this point we do not know what will turn out to be fiction and what will turn out to be science. Here are some of the biggest known dangers of the continued development of AI. Of course, the biggest dangers are the unknowns.
- Misuse: One of the biggest worries is that generative AI could be used for malicious purposes, such as generating fake news, impersonating people, creating deepfakes, or even launching cyberattacks.
- Bias and unfairness: Another concern is that generative AI models may be biased or unfair, as they are trained on biased datasets or may perpetuate existing societal biases.
- Privacy: Generative AI models can generate highly personal information about individuals, such as biometric data, which raises concerns about privacy and data protection.
- Regulation: The rapid development of generative AI has raised questions about how to regulate and oversee its use to ensure it is safe, ethical, and transparent.
- Unintended consequences: Generative AI models are complex and may behave in ways their developers did not intend or anticipate, with outcomes that have negative impacts on society.
- Ethical concerns: The use of generative AI raises ethical concerns, such as how to ensure that it is used in a way that benefits society and respects individual rights and freedoms.
What will the future bring?
- Technological Utopia: In this future, advances in technology, including generative AI, lead to a world of abundance, where human needs are met and people have more time for creative and fulfilling pursuits. Generative AI models are used to solve pressing societal problems, such as climate change, and to create new forms of art and expression.
- Dystopian Surveillance State: In this future, generative AI is used to create highly realistic deepfakes and to monitor and control people’s behavior. Governments and corporations use these technologies to maintain power and control over society, leading to a loss of privacy and individual freedom.
- Post-Work Society: In this future, advances in technology, including generative AI, lead to widespread automation and the displacement of many jobs. However, society adapts to this shift, and people find new ways to organize their lives around non-work pursuits, such as art, education, and leisure.
- Technological Stagnation: In this future, advances in technology, including generative AI, slow down due to a lack of investment, regulation, or breakthroughs. As a result, progress in other areas, such as sustainability and social justice, is also slowed, leading to a stagnation of society as a whole.