Gen-3 Alpha: Explore the Latest in AI Video Generation

Plus: news from Leonardo.ai, YouTube Music, and Google

In last week’s poll, most of you voted for this newsletter to focus on AI art guides and tutorials of all sorts. Following your wishes, I have a new tutorial for you this week. I've decided to explore how to use the latest Gen-3 Alpha video generator by Runway most effectively, as this is currently the hottest topic in the AI art space. It appears that Gen-3 Alpha is the most advanced text-to-video AI generator available to the public, so let’s dive in and experiment with it a bit.

Before we begin, I want to introduce the first sponsor of this newsletter, The Rundown. I believe it’s the best newsletter for staying updated on AI news. I read it daily to stay in the loop.

Learn AI in 5 minutes a day.

The Rundown is the world’s largest AI newsletter, read by over 600,000 professionals from companies like Apple, OpenAI, NASA, Tesla, and more.

Their expert research team spends all day learning what’s new in AI and gives you ‘the rundown’ of the most important developments in one free email every morning.

The result? Readers not only keep up with the insane pace of AI but also learn why it actually matters.

🗣 Gen-3 Alpha by Runway

To get access to Gen-3 Alpha, you need to be on a paid plan. The Standard plan costs $15 per month if billed monthly or $12 per month if billed annually. It includes 625 monthly credits, the ability to generate 10-second videos with Gen-3 Alpha, a watermark removal option, enhanced video export resolution, and other bonuses.

A 5-second video created with Gen-3 will cost you 50 credits, while a 10-second video will deduct 100 credits from your balance. With the credits included in the Standard plan, you'll be able to generate only 6 to 12 videos a month, depending on their length. If you need to create more videos, you can purchase 1,000 credits (equivalent to 100 seconds of Gen-3 video) for $10, or upgrade to the Unlimited plan for $95 a month if billed monthly.
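To make the credit math concrete, here is a small back-of-the-envelope sketch in Python. The constant and function names are mine, purely for illustration; the numbers come from the pricing above.

```python
# Rough credit math for Runway's Standard plan (numbers from the pricing above).
GEN3_CREDITS_PER_SECOND = 10      # 50 credits / 5 s, or 100 credits / 10 s
MONTHLY_CREDITS_STANDARD = 625    # included with the Standard plan
TOP_UP_CREDITS = 1_000            # extra credits you can buy for $10

def videos_per_month(duration_seconds: int, credits: int = MONTHLY_CREDITS_STANDARD) -> int:
    """How many Gen-3 videos of a given length a credit balance covers."""
    return credits // (duration_seconds * GEN3_CREDITS_PER_SECOND)

print(videos_per_month(5))                        # 12 five-second videos
print(videos_per_month(10))                       # 6 ten-second videos
print(TOP_UP_CREDITS // GEN3_CREDITS_PER_SECOND)  # 100 extra seconds for $10
```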

How to Use

Generating videos with Runway is straightforward and doesn’t require any advanced technical skills.

In your Dashboard, select the "Text/Image to Video" tool, and you’ll see a window like the one below.

Here, you can choose the model to use (Gen-3 Alpha or Gen-2), write your prompt, select the video duration (5 seconds or 10 seconds), adjust settings (e.g., remove watermark), and apply any custom presets you have created. Then, simply press Generate and wait for your video to be created. It may take a few minutes.

Prompt: Wide angle establishing shot: An ocean wave crashes near a bustling megapolis city. The city skyline, with its tall skyscrapers and lights reflecting on the water, is visible in the background. The silhouette of the city against a vibrant sunset creates a warm, diffused lighting effect on the wave. Focus on the wave’s motion and the contrasting city skyline.

I find the video quite good, and it follows the prompt closely. However, this example probably wasn't very challenging; we'll get to harder cases later in the article.

You may have noticed that I used a rather long prompt with a specific structure. When crafting this prompt, I followed the tips provided by the Runway team and other sources. Let's delve into these prompt techniques.

Prompt Tips

Runway has published a Gen-3 Alpha Prompting Guide, offering excellent tips on writing effective prompts for this tool. Here are a few key points to consider if you want to get the most out of Gen-3 Alpha:

  • Prompts are most effective when they follow a clear structure, dividing details about the scene, subject, and camera movement into separate sections, for example [camera movement]: [establishing scene]. [additional details] (see the short sketch after this list).

  • Repeating or emphasizing key ideas in different sections of your prompt can help improve the accuracy of the output.

  • Using keywords can help achieve specific styles, but make sure they are cohesive with your overall prompt. Useful keywords can relate to:

    • Camera styles: low angle, high angle, overhead, FPV, handheld, wide angle, close-up, macro cinematography, over the shoulder.

    • Lighting styles: diffused lighting, silhouette, backlit.

    • Movement speeds: dynamic motion, slow motion, fast motion, timelapse.

    • Movement types: grows, emerges, explodes, ascends, unfolds, transforms.

    • Style & aesthetic: moody, cinematic, home video VHS.

    • Text styles: graffiti, neon, embroidery.

  • If you're having difficulty formulating a prompt, consider asking a tool like ChatGPT or Claude to suggest text-to-video prompts based on the recommendations above.
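As a small illustration of the structure above, here is a hypothetical Python snippet that assembles a Gen-3 prompt from its three parts. The helper function and the example wording are my own and not part of Runway's tooling.

```python
def build_gen3_prompt(camera_movement: str, establishing_scene: str, additional_details: str) -> str:
    """Assemble a prompt using the [camera movement]: [establishing scene]. [additional details] structure."""
    return f"{camera_movement}: {establishing_scene}. {additional_details}"

prompt = build_gen3_prompt(
    camera_movement="Wide angle establishing shot",
    establishing_scene="An ocean wave crashes near a bustling megalopolis at sunset",
    additional_details=(
        "Silhouette of the skyline against a vibrant sky, diffused lighting. "
        "Focus on the wave's motion and the contrasting city skyline."
    ),
)
print(prompt)
```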

Examples Across Different Topics and Features

No matter how well-crafted your prompt is, it’s important to understand that current text-to-video tools still have many limitations. While they excel at generating colorful abstract videos, they often fall short when attempting to create realistic videos with specific actions. Let’s now explore Gen-3’s strengths and weaknesses with a few examples.

Abstract videos

Runway can create impressive abstract videos with a bright color palette. In such videos, you won’t notice any incorrect object interactions or unwanted artifacts, because in abstract videos, everything is possible.

Prompt: Dynamic motion and close-up: A swirling mix of vibrant colors and shapes morphing and evolving continuously. Patterns and textures intertwine, creating a mesmerizing dance of abstract forms. Diffused lighting enhances the fluidity and depth of the visuals. Focus on the interplay of colors and shapes, emphasizing the constant transformation and movement.

Timelapse videos

The tool usually creates fairly decent timelapse videos, but you may still notice some unwanted artifacts and unrealistic motion patterns. For example, in the video below, the buildings change appearance abruptly when night falls instead of the scene transitioning gradually from day to night.

Prompt: Timelapse, street camera perspective: A bustling city street with pedestrians and vehicles moving swiftly through the scene. Skyscrapers and city lights create a vibrant urban backdrop. As the video progresses, the sky changes from day to night, highlighting the dynamic flow of city life. Emphasize the continuous movement and energy of the city.

People & specific actions

The Gen-3 video generator still has significant issues with depicting people, especially when they are performing specific actions or when their hands are in the frame.

Prompt: Over-the-shoulder shot, slow motion: A professional chef expertly preparing a gourmet meal in a modern kitchen. The chef, dressed in a crisp white uniform and hat, chops vegetables with precision, then sautéing them in a sizzling pan. The kitchen is well-lit with natural light streaming through large windows, creating a warm and inviting atmosphere. Focus on the chef's skilled hands and the vibrant colors of the fresh ingredients.

As you can see here, the video starts off well, but then the chef uses the knife to stir the food (which doesn’t make much sense), and at one point, vegetables magically appear from his hand.

Text in videos

The new video generator by Runway can also create text in videos. Not surprisingly, it usually works better with simple, short words, while generating a multi-word, complex phrase is quite challenging.

Prompt: Wide angle: A clear blue sky with fluffy white clouds drifting by. Gradually, the clouds start to merge and reshape, forming the word "Hello" in a whimsical, flowing script. The scene is brightly lit by the sun, creating a serene and magical atmosphere. Focus on the transformation of the clouds as they spell out the word.

Prompt: Close-up, dynamic motion: A pristine beach with gentle waves lapping at the shore. Gradually, the sand starts to shift and move, forming the words "Happy Anniversary!" in elegant script. The scene is warmly lit by the setting sun, casting a golden glow over the beach. Focus on the transformation of the sand as it shapes the message, emphasizing the natural beauty of the beach and the celebratory text.

Additionally, from the Gen-3 videos posted on X, I’ve noticed that the tool usually doesn’t have issues generating the word "RUNWAY" in the videos, which makes sense. 😆

Prompt: Wide angle: A clear blue sky with fluffy white clouds drifting by. Gradually, the clouds start to merge and reshape, forming the word "RUNWAY" in a whimsical, flowing script. The scene is brightly lit by the sun, creating a serene and magical atmosphere. Focus on the transformation of the clouds as they spell out the word.

The text looks very nice, but it’s not made out of clouds, as requested.

Conclusion: Another Step in the Right Direction

The Gen-3 tool represents a significant advancement compared to previous versions, showcasing impressive capabilities in AI video generation. However, there are still notable limitations, particularly with rendering people performing complex actions. Despite these challenges, it is exciting to anticipate how the AI video generation space will evolve over the next 1-2 years, potentially overcoming these hurdles and unlocking even more creative possibilities.

🗞 News and Top Reads

  • Leonardo.ai is preparing to launch Phoenix, its new foundational model designed to offer enhanced control by "adhering to prompts with exceptional accuracy."

    • Phoenix marks Leonardo.ai's first in-house model. Previously, Leonardo relied on open-source Stable Diffusion models.

    • The model is currently in preview stages, with an official launch scheduled for later this month.

  • YouTube Music is introducing an AI-powered sound search feature:

    • Users can tap the magnifying glass icon located in the top-right corner of YouTube Music to access a waveform button. This launches a fullscreen UI where you can sing, hum, or play a song.

    • The results page displays cover art, song name, artist, album, year, and download status, along with options to Play or Save to your library.

    • YouTube Music is also testing an "AI-generated conversational radio" in the US for Premium users. This feature allows users to create a custom radio station by describing precisely what they want to hear.

  • Google has introduced Google Vids, an AI video generator, to Workspace Labs.

    • Vids is an AI-powered video creation application tailored for professional use and seamlessly integrated with the Workspace suite.

    • Users can leverage high-quality templates or utilize Gemini in Vids to expedite initial drafts through AI generation.

📌 AI Art Video Tutorial: Color-grading Tool

In this video, Nolan from Future Tech Pilot provides a walkthrough of Color.io, a new AI-powered color-grading tool. The tool stands out for its quality, performance, flexibility, speed, and user-friendly interface, and it even offers a free tier! So, if you are a photographer, designer, or artist, you should definitely check it out. Neither Nolan nor I was sponsored by Color.io 😃

🖼 AI-Assisted Artwork of the Week

Carol is a digital creator who harnesses AI alongside conventional digital mediums to craft art that is both captivating and uplifting. Her passion lies in creating colorful, cheerful, and beautiful pieces, often featuring whimsical doodles and fantastical creatures. As an AI Doodler, she is constantly exploring new techniques and technologies to push the boundaries of what's possible.

If you read this newsletter issue online and like it, feel free to subscribe to get the latest AI art updates delivered to your email a few times a month.

If you enjoy this newsletter and know someone who might also appreciate it, please feel free to share it with them. Let's spread the word about AI art and introduce more people to this fascinating field!
