Welcome to this tutorial, where we'll explore two distinct methods to add style to your videos. Method 1 is the quick option, usually rendering one second of video in about one minute. Method 2 is slightly slower but highly reliable.
If your video has minimal motion, Method 1 is the ideal choice. On the other hand, if your content involves more dynamic movements, like dance videos, Method 2 is the way to go. For shorter videos, I suggest investing the time in Method 2 for optimal results.
To transform your video into an incredible piece of art we will use Stable Diffusion, Deforum and ControlNet. Assuming you already know how to install extensions in Stable Diffusion, we’ll speed through this process. For ControlNet we will try different models. I suggest downloading the following models:
Make sure to download these before getting started. All the links can be found below.
To start, grab an image or frame from your video. There are several ways to do this, but I recommend using professional editing software such as Premiere Pro or DaVinci Resolve. Alternatively, you can take a screenshot with the built-in Snipping Tool on Windows.
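If you'd rather script this step, here is a minimal sketch that builds an ffmpeg command to save a single frame. It assumes ffmpeg is installed and on your PATH, which this tutorial does not otherwise require. The file names and the timestamp are placeholders.

```python
import shutil
import subprocess


def build_frame_command(video_path: str, timestamp: str, output_path: str) -> list[str]:
    """Return an ffmpeg command that saves one frame of the video as an image."""
    return [
        "ffmpeg",
        "-ss", timestamp,   # seek to this position (HH:MM:SS or seconds)
        "-i", video_path,   # input video
        "-frames:v", "1",   # write exactly one video frame
        "-y",               # overwrite the output file if it already exists
        output_path,
    ]


def extract_frame(video_path: str, timestamp: str = "00:00:01",
                  output_path: str = "frame.png") -> str:
    """Run ffmpeg and return the path of the saved frame."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_frame_command(video_path, timestamp, output_path), check=True)
    return output_path
```

Calling `extract_frame("my_video.mp4", "00:00:03")` would write `frame.png` next to your script, ready to upload in img2img.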
Head to the img2img tab in Stable Diffusion and upload your image. Don’t forget to select a checkpoint for your style. I’ve chosen the ToonYou checkpoint.
Next, write a prompt that describes your image. To speed up image generation I’ve used the LCM LoRA. You can learn more about it in our detailed blog about LCM LoRAs.
Now we need to adjust the img2img settings. These may vary based on the checkpoint and whether or not you’re using the LCM LoRA. I highly recommend using the LCM LoRA, as it speeds up the generation process a lot.
Now it’s time to use ControlNet. Open up your ControlNet tab and enable it.
Now adjust the following settings:
Now you’re all set to hit generate!
Open up your ControlNet tab and enable two ControlNet units. For the first unit, adjust the following settings:
For the second ControlNet unit use these settings:
If you’re satisfied with the outcome we can move on to the Deforum settings!
To make your life a little easier I’ve made 2 settings files that you can download for free.
Simply download the file and put it in your stable-diffusion-webui folder.
Copy the path of the file, paste it in the Deforum settings file section and press “Load all settings”.
After successfully loading all the settings, there are still a few settings you need to change yourself. Let’s run through those real quick.
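As far as I know, Deforum settings files are plain JSON saved with a .txt extension, so you can inspect or tweak them before loading. The sketch below overrides a few values and refuses keys that are not already in the file. The key names in the usage example (`W`, `H`, `batch_name`) are assumptions based on typical Deforum settings files, so check them against the file you downloaded.

```python
import json


def apply_overrides(settings_text: str, overrides: dict) -> str:
    """Return the settings JSON with the given keys replaced.

    Raises KeyError for keys missing from the file, so a typo does not
    silently add a setting that Deforum would ignore.
    """
    settings = json.loads(settings_text)
    unknown = [key for key in overrides if key not in settings]
    if unknown:
        raise KeyError(f"Keys not present in settings file: {unknown}")
    settings.update(overrides)
    return json.dumps(settings, indent=4)


# Hypothetical usage with a tiny stand-in for a real settings file.
sample = '{"W": 512, "H": 512, "batch_name": "Deforum"}'
patched = apply_overrides(sample, {"W": 768, "H": 448, "batch_name": "my_video"})
```

Write `patched` back to a .txt file and load that path in Deforum as described above.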
In the Run tab you might need to change the width and height to match your video’s aspect ratio. Here you can also rename your output video under Batch name.
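To pick a width and height that keep your video's aspect ratio, a small helper like the one below can do the arithmetic. It is only a sketch: the target long side of 768 and the rounding to multiples of 64 are my assumptions, not values taken from the settings files.

```python
def fit_dimensions(src_width: int, src_height: int,
                   long_side: int = 768, multiple: int = 64) -> tuple[int, int]:
    """Scale the source size so its longer side is about `long_side`,
    rounding each side to the nearest multiple of `multiple`."""
    scale = long_side / max(src_width, src_height)

    def snap(value: int) -> int:
        # Round to the nearest multiple, but never below one multiple.
        return max(multiple, int(round(value * scale / multiple)) * multiple)

    return snap(src_width), snap(src_height)


landscape = fit_dimensions(1920, 1080)  # a 16:9 source video
portrait = fit_dimensions(1080, 1920)   # a vertical 9:16 source video
```

Because of the rounding, the result is close to the original aspect ratio rather than exact, which is usually fine for stylized output.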
In the Keyframes tab you have the option to change the Cadence. If your video has a lot of movement, I suggest keeping this at 1. You can also play around with the Strength Schedule, which controls how much the previous frame influences the next frame. For slower videos I recommend playing with this, as it can improve consistency a lot. You can ignore “Max frames” as it does not apply to this workflow.
A quick note: if you have an AMD or Intel GPU, disable “Use depth warping” in the Depth Warping & FOV sub-tab.
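Deforum schedules are written as `frame: (value)` pairs separated by commas. If you want to try several strength values, a tiny formatter like this sketch saves some typing. The function name is mine, and the second keyframe in the example is purely illustrative.

```python
def format_schedule(keyframes: dict[int, float]) -> str:
    """Turn {frame: value} pairs into a schedule string, ordered by frame."""
    return ", ".join(
        f"{frame}: ({value})" for frame, value in sorted(keyframes.items())
    )


constant = format_schedule({0: 0.65})          # one value for the whole video
changing = format_schedule({60: 0.5, 0: 0.65})  # lower strength from frame 60
```

Paste the resulting string into the Strength Schedule field.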
In the Prompts tab, change the prompts to match the ones we used in our testing. Make sure to include the LoRA and use the right formatting.
In the Init tab, locate the Video Init sub-tab and paste your video path there. On Windows 10, hold Shift, right-click your video and press “Copy as path”. Then paste the path and remove the double quotes.
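If you handle paths in a script, the same cleanup (dropping the quotes that “Copy as path” adds) is a one-liner. This is just an illustration with a made-up path.

```python
def clean_copied_path(raw: str) -> str:
    """Remove surrounding whitespace and the double quotes Windows adds."""
    return raw.strip().strip('"')


# Hypothetical path, as it would arrive from the clipboard.
cleaned = clean_copied_path(r'"C:\Videos\dance.mp4"')
```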
Set the ControlNet settings to match the ones we used in testing. We don’t need to input our video path here as it will copy over the same path used in the Init tab.
In the Output tab, make sure that the FPS matches the video you’re using. To find the frames per second (FPS) of your original video, open its Properties, select Details, and look for the value next to "Frame Rate".
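If you have ffmpeg installed, you can also read the frame rate with ffprobe, which ships with it. ffprobe reports the rate as a fraction such as `30000/1001`, so the sketch below converts it to a decimal. This assumes ffprobe is on your PATH, and the function names are my own.

```python
import subprocess
from fractions import Fraction


def parse_frame_rate(raw: str) -> float:
    """Convert ffprobe's fractional frame rate (e.g. '30000/1001') to a float."""
    return float(Fraction(raw.strip()))


def probe_fps(video_path: str) -> float:
    """Ask ffprobe for the frame rate of the first video stream."""
    result = subprocess.run(
        [
            "ffprobe", "-v", "error",
            "-select_streams", "v:0",
            "-show_entries", "stream=r_frame_rate",
            "-of", "default=noprint_wrappers=1:nokey=1",
            video_path,
        ],
        capture_output=True, text=True, check=True,
    )
    return parse_frame_rate(result.stdout)
```

Round the result to the nearest value the FPS field accepts, for example 30 for 29.97.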
Additionally you can add a soundtrack or keep the same audio as your original video under Add soundtrack.
Now you’re ready to generate!
Let's delve into a showcase of how these methods work across different settings. These examples highlight the transformative power of Stable Diffusion and Deforum, emphasizing the nuances in style, motion, and visual enhancements achievable through distinct approaches. Below are a few instances demonstrating the versatility and impact of each method.
Here is the same video using Method 2.
For this example I've chosen a video with less movement: a slow camera pan. Because there is little motion, I've set the Cadence to 2, which makes Deforum render the video roughly twice as fast. I've also set the Strength Schedule to 0.65, so each frame that Deforum renders takes about 65% of the previous frame as reference. This keeps the overall video more consistent and makes it look much smoother. I did not upscale or interpolate this video. This is straight out of Deforum.
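To see where the speed-up comes from, here is a back-of-the-envelope sketch. It assumes that with a cadence of N only every Nth frame is actually diffused and the frames in between are filled in cheaply, which is how I understand cadence to work. The 300-frame clip is a made-up example.

```python
import math


def diffused_frames(total_frames: int, cadence: int) -> int:
    """Estimate how many frames go through a full diffusion pass."""
    if cadence < 1:
        raise ValueError("cadence must be at least 1")
    return math.ceil(total_frames / cadence)


# A hypothetical 10-second clip at 30 FPS.
full = diffused_frames(300, cadence=1)
halved = diffused_frames(300, cadence=2)
```

Half as many diffusion passes is why the render takes roughly half the time, at the cost of less detail on fast motion.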
In conclusion, this tutorial has equipped you to transform your videos into captivating pieces using Stable Diffusion and Deforum. Our streamlined installation process ensured you have the necessary tools, including Stable Diffusion, Deforum, and ControlNet, along with the required models and extensions. The testing phase in img2img guided you through selecting checkpoints, crafting prompts, and adjusting settings for a seamless transition from image to enhanced visual style. As you conclude, remember to experiment. Tweak settings, explore new prompts, and discover your unique style. With Stable Diffusion and Deforum, unleash your creativity and let your videos tell captivating stories. Happy creating!
It depends on your video content. Choose Method 1 for minimal motion videos and Method 2 for dynamic content like dance sequences. Method 2 is slightly slower but ideal for videos with more movement, ensuring optimal results for dynamic scenarios. For videos under 15 seconds opt for Method 2 using TemporalNet & SoftEdge for more consistency.
The LCM LoRA is recommended because it speeds up the image generation process. You can proceed without it, but using the LCM LoRA, as suggested in the blog, makes the transformation considerably faster. If you decide not to use it, I recommend a minimum of 15 sampling steps and raising the CFG Scale to around 7.
Yes, for AMD or Intel GPUs, it's recommended to disable "Use depth warping" in the Depth Warping & FOV sub-tab under the Keyframes tab in Deforum settings. This ensures compatibility and smooth processing.
Absolutely! While the blog provides specific settings, feel free to experiment and customize ControlNet settings based on your video's characteristics. Adjusting parameters like Control Type, Preprocessor, Model, Control Weight, and Control Mode allows for tailored results.
Ensure consistency between ControlNet and Deforum settings by matching prompts used in testing. The prompts under the Prompts tab in Deforum settings should align with those used in img2img testing with ControlNet, maintaining coherence throughout the video transformation process.