Tolga Cangöz 7071b7461b Errata: Fix typos & \s+$ (#9008)

* Fix typos

* chore: Fix typos

* chore: Update README.md for promptdiffusion example

* Trim trailing white spaces

* Fix a typo

* update number

* chore: update number

* Trim trailing white space

* Update README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

2024-08-02 21:24:25 -07:00

Flux

Flux is a series of text-to-image generation models based on diffusion transformers. To learn more about Flux, check out the original blog post by its creators, Black Forest Labs.

Original model checkpoints for Flux can be found here. Original inference code can be found here.

Flux can be quite expensive to run on consumer hardware. However, you can apply a suite of optimizations to run it faster and with less memory. Check out this section for more details. Additionally, Flux can benefit from quantization, which improves memory efficiency at the cost of some inference latency. Refer to this blog post to learn more.
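As a minimal sketch of the memory-oriented options mentioned above, the helper below groups a few switches that diffusers pipelines expose. The function name `apply_memory_savers` and its `sequential` flag are hypothetical and only serve this illustration; the methods it calls (`enable_model_cpu_offload`, `enable_sequential_cpu_offload`, `vae.enable_slicing`, `vae.enable_tiling`) are assumed to be available on the loaded pipeline. How much memory they save, and how much speed they cost, depends on your hardware.

```python
def apply_memory_savers(pipe, sequential=False):
    """Hypothetical helper: turn on memory-saving options on a loaded pipeline."""
    if sequential:
        # Offloads at a finer granularity: lowest memory use, usually slowest.
        pipe.enable_sequential_cpu_offload()
    else:
        # Keeps only the component currently in use on the accelerator.
        pipe.enable_model_cpu_offload()
    # Decode batches one image at a time, and large images tile by tile.
    pipe.vae.enable_slicing()
    pipe.vae.enable_tiling()
    return pipe
```

Only one of the two offloading modes should be active at a time, which is why the sketch picks between them instead of enabling both. It would be called right after `FluxPipeline.from_pretrained(...)`, in place of a bare `pipe.enable_model_cpu_offload()`.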

Flux comes in two variants:

Timestep-distilled (black-forest-labs/FLUX.1-schnell)
Guidance-distilled (black-forest-labs/FLUX.1-dev)

The two checkpoints have slightly different usage, which we detail below.

Timestep-distilled

max_sequence_length cannot be more than 256.
guidance_scale needs to be 0.
As this is a timestep-distilled model, it benefits from fewer sampling steps.

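import torch
from diffusers import FluxPipeline

# Load the timestep-distilled checkpoint in bfloat16 to reduce memory use.
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16)
# Move each component to the GPU only while it is needed.
pipe.enable_model_cpu_offload()

prompt = "A cat holding a sign that says hello world"
out = pipe(
    prompt=prompt,
    guidance_scale=0.0,  # must be 0 for this variant
    height=768,
    width=1360,
    num_inference_steps=4,  # few steps suffice for the timestep-distilled model
    max_sequence_length=256,  # upper limit for this variant
).images[0]
out.save("image.png")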

Guidance-distilled

The guidance-distilled variant takes about 50 sampling steps for good-quality generation.
It doesn't have any limitations around the max_sequence_length.

import torch
from diffusers import FluxPipeline

# Load the guidance-distilled checkpoint in bfloat16 to reduce memory use.
pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
# Move each component to the GPU only while it is needed.
pipe.enable_model_cpu_offload()

prompt = "a tiny astronaut hatching from an egg on the moon"
out = pipe(
    prompt=prompt,
    guidance_scale=3.5,
    height=768,
    width=1360,
    num_inference_steps=50,  # about 50 steps for good-quality generation
).images[0]
out.save("image.png")
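The per-variant settings described above can be gathered in one place. The following is a minimal sketch, not part of diffusers: `flux_call_kwargs` and the constants are hypothetical names, and the default values simply mirror the two examples on this page (4 steps and `guidance_scale=0` for the timestep-distilled checkpoint, 50 steps and `guidance_scale=3.5` for the guidance-distilled one).

```python
# Hypothetical helper, not a diffusers API: it only collects the settings
# described on this page into keyword arguments for a FluxPipeline call.
SCHNELL = "black-forest-labs/FLUX.1-schnell"
DEV = "black-forest-labs/FLUX.1-dev"
SCHNELL_MAX_SEQUENCE_LENGTH = 256


def flux_call_kwargs(model_id, max_sequence_length=None):
    """Return call arguments that respect the constraints of each variant."""
    if model_id == SCHNELL:
        length = SCHNELL_MAX_SEQUENCE_LENGTH if max_sequence_length is None else max_sequence_length
        if length > SCHNELL_MAX_SEQUENCE_LENGTH:
            raise ValueError("max_sequence_length cannot be more than 256 for FLUX.1-schnell")
        # Timestep-distilled: guidance must be 0 and few steps are enough.
        return {"guidance_scale": 0.0, "num_inference_steps": 4, "max_sequence_length": length}
    if model_id == DEV:
        # Guidance-distilled: about 50 steps, no limit imposed here on sequence length.
        kwargs = {"guidance_scale": 3.5, "num_inference_steps": 50}
        if max_sequence_length is not None:
            kwargs["max_sequence_length"] = max_sequence_length
        return kwargs
    raise ValueError(f"unknown Flux checkpoint: {model_id}")
```

With a pipeline loaded as in the examples above, the result could be passed along as `pipe(prompt=prompt, height=768, width=1360, **flux_call_kwargs(SCHNELL))`. Raising early on an oversized `max_sequence_length` is a design choice of this sketch, so that the mistake surfaces before any model weights are loaded.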

FluxPipeline

[[autodoc]] FluxPipeline
	- all
	- __call__
