DALL·E Now Available Without Waitlist

New users can start creating straight away. Lessons learned from deployment and improvements to our safety systems make wider availability possible.

Starting later today, we are removing the waitlist for the DALL·E beta so users can sign up and start using it immediately. More than 1.5M users are now actively creating over 2M images a day with DALL·E—from artists and creative directors to authors and architects—with over 100K users sharing their creations and feedback in our Discord community.

Responsibly scaling a system as powerful and complex as DALL·E—while learning about all the creative ways it can be used and misused—has required an iterative deployment approach.

Since we first previewed the DALL·E research in April, users have helped us discover new uses for it as a powerful creative tool. Artists, in particular, have provided important input on DALL·E’s features.

“Cyberpunk cat, 90s Japan anime style” by OpenAI

“Wildflowers, grassy field, autumn rhythm, watercolor” by OpenAI

“Running at the edge of space, toward a planet, calm, reaching the abyss, digital art” by OpenAI

Their feedback inspired us to build features like Outpainting, which lets users continue an image beyond its original borders and create bigger images of any size, and collections—so users can create in all new ways and expedite their creative processes.

Learning from real-world use has allowed us to improve our safety systems, making wider availability possible today. In the past months, we’ve made our filters more robust at rejecting attempts to generate sexual, violent and other content that violates our content policy and built new detection and response techniques to stop misuse.

We are currently testing a DALL·E API with several customers and are excited to soon offer it more broadly to developers and businesses so they can build apps on this powerful system.

We can't wait to see what users from around the world create with DALL·E. Sign up today and start creating.


source https://openai.com/blog/dall-e-now-available-without-waitlist/

Ethical and Trustworthy AI: Lessons from the Front Lines [Video]

At the Marketing AI Institute, we’ve covered ethical AI in webinars, blog posts, MAICON sessions, AI Academy for Marketers lessons, and more. But are the right questions being asked? Is the necessary time and care really being put into the implementation of AI-powered technologies? We’ve talked and read about it in academic papers, blog posts, and other content, but on the practical side, what we need are examples of ethical and trustworthy AI implementation.

from Marketing AI Institute | Blog https://ift.tt/ImJ3wHK
via IFTTT

Introducing Whisper

We’ve trained and are open-sourcing a neural net called Whisper that approaches human level robustness and accuracy on English speech recognition.


Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise and technical language. Moreover, it enables transcription in multiple languages, as well as translation from those languages into English. We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing.


The Whisper architecture is a simple end-to-end approach, implemented as an encoder-decoder Transformer. Input audio is split into 30-second chunks, converted into a log-Mel spectrogram, and then passed into an encoder. A decoder is trained to predict the corresponding text caption, intermixed with special tokens that direct the single model to perform tasks such as language identification, phrase-level timestamps, multilingual speech transcription, and to-English speech translation.
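The first step of that pipeline can be sketched in a few lines. This is an illustrative fragment, not Whisper’s actual preprocessing code: the function name and constants are our own, though the 30-second window and 16 kHz sample rate match the released model. Each chunk would then be converted to a log-Mel spectrogram before entering the encoder.

```python
import numpy as np

SAMPLE_RATE = 16_000           # Whisper operates on 16 kHz audio
CHUNK_SECONDS = 30
CHUNK_SAMPLES = SAMPLE_RATE * CHUNK_SECONDS

def chunk_audio(audio: np.ndarray) -> list[np.ndarray]:
    """Split a 1-D waveform into 30-second chunks, zero-padding the last one."""
    chunks = []
    for start in range(0, len(audio), CHUNK_SAMPLES):
        chunk = audio[start:start + CHUNK_SAMPLES]
        if len(chunk) < CHUNK_SAMPLES:
            # Pad short final chunk with silence to a fixed-length input
            chunk = np.pad(chunk, (0, CHUNK_SAMPLES - len(chunk)))
        chunks.append(chunk)
    return chunks

# 45 seconds of audio -> two fixed-length chunks, the second padded out to 30 s
chunks = chunk_audio(np.zeros(SAMPLE_RATE * 45))
print(len(chunks), len(chunks[0]), len(chunks[1]))
```

Fixed-length inputs let a single encoder handle arbitrarily long audio by processing windows independently.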


Other existing approaches frequently use smaller, more closely paired audio-text training datasets, or use broad but unsupervised audio pretraining. Because Whisper was trained on a large and diverse dataset and was not fine-tuned to any specific one, it does not beat models that specialize in LibriSpeech performance, a famously competitive benchmark in speech recognition. However, when we measure Whisper’s zero-shot performance across many diverse datasets we find it is much more robust and makes 50% fewer errors than those models.
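Error comparisons like the one above are typically measured as word error rate (WER): the word-level edit distance between the model’s transcript and the reference, divided by the reference length. A minimal implementation, written here for illustration (production evaluations usually apply text normalization first):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                     # deletions only
    for j in range(len(hyp) + 1):
        dp[0][j] = j                     # insertions only
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[-1][-1] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deletion over 6 words
```

A "50% fewer errors" claim means the model’s average WER across datasets is half that of the comparison models.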

About a third of Whisper’s audio dataset is non-English, and it is alternately given the task of transcribing in the original language or translating to English. We find this approach is particularly effective at learning speech-to-text translation, and it outperforms the supervised state of the art on CoVoST2 to-English translation in the zero-shot setting.
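Task switching is driven by the special tokens the decoder is conditioned on. The sketch below builds such a prefix; the token strings follow the format used in the released Whisper tokenizer, but the helper function itself is our own illustration:

```python
def decoder_prompt(language: str, task: str, timestamps: bool = False) -> str:
    """Build the special-token prefix that tells the single Whisper model
    which language to expect and which task to perform."""
    assert task in ("transcribe", "translate")
    tokens = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        tokens.append("<|notimestamps|>")
    return "".join(tokens)

# Same French audio, two tasks: transcribe in French vs. translate to English
print(decoder_prompt("fr", "transcribe"))
print(decoder_prompt("fr", "translate"))
```

Because the task is just part of the conditioning sequence, one set of weights serves transcription and translation across all supported languages.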


We hope Whisper’s high accuracy and ease of use will allow developers to add voice interfaces to a much wider set of applications. Check out the paper, model card, and code to learn more details and to try out Whisper.



source https://openai.com/blog/whisper/

Reach Your Ideal Customers in Unexpected Places with AI

Digital fatigue is real. We’re all experiencing it as marketers and as consumers. As brands work to gain share of voice in a fragmented digital marketplace, they’re looking in alternate and unexpected places to reach consumers. And it’s a great idea. With some semblance of normalcy returning to the world, consumer travel and activities on the rise, and most adults in need of a digital detox, we marketers need to look outside computers and phone screens to reach customers and prospects. 

from Marketing AI Institute | Blog https://ift.tt/QTBqLOn
via IFTTT

What Does the Future of AI Look Like? These AI Experts Will Tell You

In the last few decades, we’ve experienced a rapid evolution of AI. Since its humble beginnings in 1956, artificial intelligence has transformed from simple predictive models to powerful machines, fueled by deep learning.

Today, AI has become much more feasible for organizations, thanks to foundational models made available by tech giants like Google and Meta. The focus has shifted from gathering large amounts of data to using the right data in a responsible way.

So, what does the future of AI look like?

from Marketing AI Institute | Blog https://ift.tt/JtpRjzf
via IFTTT