* init for korean docs
* edit build yml file for multi language docs
* edit one more build yml file for multi language docs
* add title for get_frontmatter error
* add translating.md
* default language for docs is en
* Update docs/TRANSLATING.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [Repro] Correct reproducability
* up
* up
* uP
* up
* need better image
* allow conversion from no state dict checkpoints
* up
* up
* up
* up
* check tensors
* check tensors
* check tensors
* check tensors
* next try
* up
* up
* better name
* up
* up
* Apply suggestions from code review
* correct more
* up
* replace all torch randn
* fix
* correct
* correct
* finish
* fix more
* up
* init for korean docs
* edit build yml file for multi language docs
* edit one more build yml file for multi language docs
* add title for get_frontmatter error
* Various Fixes for Flax Dreambooth
- Correctly update the progress bar every epoch
- Allow specifying a pretrained VAE
- Allow specifying a revision to pretrained models
- Cache compiled models between invocations (speeds up TPU execution a lot!)
- Save intermediate checkpoints by specifying `save_steps`
* Don't die when save_steps is not set.
* Address CR
* Address comments
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Support training SD V2 with Flax
Mostly involves supporting a v_prediction scheduler.
The implementation in #1777 doesn't take into account a recent refactor of `scheduling_utils_flax`, so this should be used instead.
* Add to other top-level files.
* [Deterministic torch randn] Allow tensors to be generated on CPU
* fix more
* up
* fix more
* up
* Update src/diffusers/utils/torch_utils.py
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* Apply suggestions from code review
* up
* up
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* misc fixes
* more comments
* Update examples/textual_inversion/textual_inversion.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* set transformers verbosity to warning
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* allow using non-ema weights for training
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* address more review comment
* reorganise a few lines
* always pad text to max_length to match original training
* ifx collate_fn
* remove unused code
* don't prepare ema_unet, don't register lr scheduler
* style
* assert => ValueError
* add allow_tf32
* set log level
* fix comment
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* [Unclip] Make sure latents can be reused
* allow one to directly pass embeddings
* up
* make unclip for text work
* finish allowing to pass embeddings
* correct more
* make style
* move files a bit
* more refactors
* fix more
* more fixes
* fix more onnx
* make style
* upload
* fix
* up
* fix more
* up again
* up
* small fix
* Update src/diffusers/__init__.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* correct
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* initial
* type hints
* update scheduler type hint
* add to README
* add example generation to README
* v -> mix_factor
* load scheduler from pretrained
* Make xformers optional even if it is available
* Raise exception if xformers is used but not available
* Rename use_xformers to enable_xformers_memory_efficient_attention
* Add a note about xformers in README
* Reformat code style
* Make safety_checker optional in more pipelines.
* Remove inappropriate comment in inpaint pipeline.
* InPaint Test: set feature_extractor to None.
* Remove import
* img2img test: set feature_extractor to None.
* inpaint sd2 test: set feature_extractor to None.
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* first proposal
* rename
* up
* Apply suggestions from code review
* better
* up
* finish
* up
* rename
* correct versatile
* up
* up
* up
* up
* fix
* Apply suggestions from code review
* make style
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* add error message
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* use repeat_interleave
* fix repeat
* Trigger Build
* don't install accelerate from main
* install released accelrate for mps test
* Remove additional accelerate installation from main.
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Section header for in-painting, inference from checkpoint.
* Inference: link to section to perform inference from checkpoint.
* Move Dreambooth in-painting instructions to the proper place.
* [Flax] Stateless schedulers, fixes and refactors
* Remove scheduling_common_flax and some renames
* Update src/diffusers/schedulers/scheduling_pndm_flax.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* expose polynomial:power and cosine_with_restarts:num_cycles using get_scheduler func, add it to train_dreambooth.py
* fix formatting
* fix style
* Update src/diffusers/optimization.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Fail if there are less images than the effective batch size.
* Remove lr-scheduler arg as it's currently ignored.
* Make guidance_scale work for batch_size > 1.
* [Batched Generators] all batched generators
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* up
* hey
* up again
* fix tests
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* correct tests
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Fix links to flash attention.
* Add xformers installation instructions.
* Make link to xformers install more prominent.
* Link to xformers install from training docs.
* Add examples with Intel optimizations (BF16 fine-tuning and inference)
* Remove unused package
* Add README for intel_opts and refine the description for research projects
* Add notes of intel opts for diffusers
* update inpaint_legacy to allow the use of predicted noise to construct intermediate diffused images
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint_legacy.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add state checkpointing to other training scripts
* Fix first_epoch
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update Dreambooth checkpoint help message.
* Dreambooth docs: checkpoints, inference from a checkpoint.
* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Remove bogus file
* [Docs] Remove mentioning of gated access since no longer exsits
* add docs to index
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Dreambooth: save / restore training state.
* make style
* Rename vars for clarity.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Remove unused import
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [SD] Make sure batched input works correctly
* uP
* uP
* up
* up
* uP
* up
* fix mask stuff
* up
* uP
* more up
* up
* uP
* up
* finish
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Added Community pipeline for comparing Stable Diffusion v1.1-4
Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>
* Made changes to provide support for current iteration of from_pretrained and added example
Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>
* updated a small spelling error
Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>
* added pipeline entry to table
Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>
Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>
* Initial code for attempt at improving SD <--> diffusers conversions for v2.0
* Updates to support round-trip between orig. SD 2.0 and diffusers models
* Corrected formatting to Black standard
* Correcting import formatting
* Fixed imports (properly this time)
* add some corrections
* remove inference files
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* dreambooth: fix#1566: maintain fp32 wrapper when saving a checkpoint to avoid crash when running fp16
* dreambooth: guard against passing keep_fp32_wrapper arg to older versions of accelerate. part of fix for #1566
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update examples/dreambooth/train_dreambooth.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
easy fix for undefined name in train_dreambooth.py
import_model_class_from_model_name_or_path loads a pretrained model
and refers to args.revision in a context where args is undefined. I modified
the function to take revision as an argument and modified the invocation
of the function to pass in the revision from args. Seems like this was caused
by a cut and paste.
* Do not recompile when guidance_scale changes.
* Remove debug for simplicity.
* make style
* Make guidance_scale an array.
* Make DEBUG a constant to avoid passing it down.
* Add comments for clarification.
* add paint by example
* mkae loading possibel
* up
* Update src/diffusers/models/attention.py
* up
* finalize weight structure
* make example work
* make it work
* up
* up
* fix
* del
* add
* update
* Apply suggestions from code review
* correct transformer 2d
* finish
* up
* up
* up
* up
* fix
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Apply suggestions from code review
* up
* finish
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* add check_min_version for examples
* move __version__ to the top
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* fix comment
* fix error_message
* adapt the install message
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* add AudioDiffusionPipeline and LatentAudioDiffusionPipeline
* add docs to toc
* fix tests
* fix tests
* fix tests
* fix tests
* fix tests
* Update pr_tests.yml
Fix tests
* parent 499ff34b3e
author teticio <teticio@gmail.com> 1668765652 +0000
committer teticio <teticio@gmail.com> 1669041721 +0000
parent 499ff34b3e
author teticio <teticio@gmail.com> 1668765652 +0000
committer teticio <teticio@gmail.com> 1669041704 +0000
add colab notebook
[Flax] Fix loading scheduler from subfolder (#1319)
[FLAX] Fix loading scheduler from subfolder
Fix/Enable all schedulers for in-painting (#1331)
* inpaint fix k lms
* onnox as well
* up
Correct path to schedlure (#1322)
* [Examples] Correct path
* uP
Avoid nested fix-copies (#1332)
* Avoid nested `# Copied from` statements during `make fix-copies`
* style
Fix img2img speed with LMS-Discrete Scheduler (#896)
Casting `self.sigmas` into a different dtype (the one of original_samples) is not advisable. In my img2img pipeline this leads to a long running time in the `integrate.quad` call later on- by long I mean more than 10x slower.
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Fix the order of casts for onnx inpainting (#1338)
Legacy Inpainting Pipeline for Onnx Models (#1237)
* Add legacy inpainting pipeline compatibility for onnx
* remove commented out line
* Add onnx legacy inpainting test
* Fix slow decorators
* pep8 styling
* isort styling
* dummy object
* ordering consistency
* style
* docstring styles
* Refactor common prompt encoding pattern
* Update tests to permanent repository home
* support all available schedulers until ONNX IO binding is available
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* updated styling from PR suggested feedback
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Jax infer support negative prompt (#1337)
* support negative prompts in sd jax pipeline
* pass batched neg_prompt
* only encode when negative prompt is None
Co-authored-by: Juan Acevedo <jfacevedo@google.com>
Update README.md: Minor change to Imagic code snippet, missing dir error (#1347)
Minor change to Imagic Readme
Missing dir causes an error when running the example code.
make style
change the sample model (#1352)
* Update alt_diffusion.mdx
* Update alt_diffusion.mdx
Add bit diffusion [WIP] (#971)
* Create bit_diffusion.py
Bit diffusion based on the paper, arXiv:2208.04202, Chen2022AnalogBG
* adding bit diffusion to new branch
ran tests
* tests
* tests
* tests
* tests
* removed test folders + added to README
* Update README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* move Mel to module in pipeline construction, make librosa optional
* fix imports
* fix copy & paste error in comment
* fix style
* add missing register_to_config
* fix class docstrings
* fix class docstrings
* tweak docstrings
* tweak docstrings
* update slow test
* put trailing commas back
* respect alphabetical order
* remove LatentAudioDiffusion, make vqvae optional
* move Mel from models back to pipelines :-)
* allow loading of pretrained audiodiffusion models
* fix tests
* fix dummies
* remove reference to latent_audio_diffusion in docs
* unused import
* inherit from SchedulerMixin to make loadable
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make attn slice recursive
* remove set_attention_slice from blocks
* fix copies
* make enable_attention_slicing base class method of DiffusionPipeline
* fix set_attention_slice
* fix set_attention_slice
* fix copies
* add tests
* up
* up
* up
* update
* up
* uP
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
The mask and instance image were being cropped in different ways without --center_crop, causing the model to learn to ignore the mask in some cases. This PR fixes that and generate more consistent results.
[textual_inversion] Add an option to only save embeddings
Add an command line option --only_save_embeds to the example script, for
not saving the full model. Then only the learned embeddings are saved,
which can be added to the original model at runtime in a similar way as
they are created in the training script.
Saving the full model is forced when --push_to_hub is used. (Implements #759)
* Add parameter safe_serialization to DiffusionPipeline.save_pretrained
* Add option safe_serialization on ModelMixin.save_pretrained
* Add test test_save_safe_serialization
* Black
* Re-trigger the CI
* Fix doc-builder
* Validate files are saved as safetensor in test_save_safe_serialization
- Add the missing `scale_model_input` method to `FlaxLMSDiscreteScheduler`
- Use `jnp.append` for appending to `state.derivatives`
- Use `jnp.delete` to pop from `state.derivatives`
* Create train_dreambooth_inpaint.py
train_dreambooth.py adapted to work with the inpaint model, generating random masks during the training
* Update train_dreambooth_inpaint.py
refactored train_dreambooth_inpaint with black
* Update train_dreambooth_inpaint.py
* Update train_dreambooth_inpaint.py
* Update train_dreambooth_inpaint.py
Fix prior preservation
* add instructions to readme, fix SD2 compatibility
* Do not use torch.long in mps
Addresses #1056.
* Use torch.int instead of float.
* Propagate changes.
* Do not silently change float -> int.
* Propagate changes.
* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* Moving the mem efficiient attention activation to the top + recursive
* black, too bad there's no pre-commit ?
Co-authored-by: Benjamin Lefaudeux <benjamin@photoroom.com>
* feat: switch core pipelines to use image arg
* test: update tests for core pipelines
* feat: switch examples to use image arg
* docs: update docs to use image arg
* style: format code using black and doc-builder
* fix: deprecate use of init_image in all pipelines
* Flax: start adapting to Stable Diffusion 2
* More changes.
* attention_head_dim can be a tuple.
* Fix typos
* Add simple SD 2 integration test.
Slice values taken from my Ampere GPU.
* Add simple UNet integration tests for Flax.
Note that the expected values are taken from the PyTorch results. This
ensures the Flax and PyTorch versions are not too far off.
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Typos and style
* Tests: verify jax is available.
* Style
* Make flake happy
* Remove typo.
* Simple Flax SD 2 pipeline tests.
* Import order
* Remove unused import.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: @camenduru
* Add heun
* Finish first version of heun
* remove bogus
* finish
* finish
* improve
* up
* up
* fix more
* change progress bar
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py
* finish
* up
* up
* up
* [Proposal] Support loading from safetensors if file is present.
* Style.
* Fix.
* Adding some test to check loading logic.
+ modify download logic to not download pytorch file if not necessary.
* Fixing the logic.
* Adressing comments.
* factor out into a function.
* Remove dead function.
* Typo.
* Extra fetch only if safetensors is there.
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* StableDiffusionUpscalePipeline
* fix a few things
* make it better
* fix image batching
* run vae in fp32
* fix docstr
* resize to mul of 64
* doc
* remove safety_checker
* add max_noise_level
* fix Copied
* begin tests
* slow tests
* default max_noise_level
* remove kwargs
* doc
* fix
* fix fast tests
* fix fast tests
* no sf
* don't offload vae
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Adapt ddpm, ddpmsolver to prediction_type.
* Deprecate predict_epsilon in __init__.
* Bring FlaxDDIMScheduler up to date with DDIMScheduler.
* Set prediction_type as an ivar for consistency.
* Convert pipeline_ddpm
* Adapt tests.
* Adapt unconditional training script.
* Adapt BitDiffusion example.
* Add missing kwargs in dpmsolver_multistep
* Ugly workaround to accept deprecated predict_epsilon when loading
schedulers using from_pretrained.
* make style
* Remove import no longer in use.
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Use config.prediction_type everywhere
* Add a couple of Flax prediction type tests.
* make style
* fix register deprecated arg
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* up
* convert dual unet
* revert dual attn
* adapt for vd-official
* test the full pipeline
* mixed inference
* mixed inference for text2img
* add image prompting
* fix clip norm
* split text2img and img2img
* fix format
* refactor text2img
* mega pipeline
* add optimus
* refactor image var
* wip text_unet
* text unet end to end
* update tests
* reshape
* fix image to text
* add some first docs
* dual guided pipeline
* fix token ratio
* propose change
* dual transformer as a native module
* DualTransformer(nn.Module)
* DualTransformer(nn.Module)
* correct unconditional image
* save-load with mega pipeline
* remove image to text
* up
* uP
* fix
* up
* final fix
* remove_unused_weights
* test updates
* save progress
* uP
* fix dual prompts
* some fixes
* finish
* style
* finish renaming
* up
* fix
* fix
* fix
* finish
Co-authored-by: anton-l <anton@huggingface.co>
* make sure fp16 runs well
* add fp16 test for superes
* Update src/diffusers/models/unet_2d.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* gen on cuda
* always run fast inferecne test on cpu
* run on cpu
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* fix non square images with UNet2DModel and DDIM/DDPM pipelines
* fix unet_2d `sample_size` docstring
* update pipeline tests for unet uncond
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Create bit_diffusion.py
Bit diffusion based on the paper, arXiv:2208.04202, Chen2022AnalogBG
* adding bit diffusion to new branch
ran tests
* tests
* tests
* tests
* tests
* removed test folders + added to README
* Update README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Handle batches and Tensors in `prepare_mask_and_masked_image`
* `blackfy`
upgrade `black`
* handle mask as `np.array`
* add docstring
* revert `black` changes with smaller line length
* missing ValueError in docstring
* raise `TypeError` for image as tensor but not mask
* typo in mask shape selection
* check for batch dim
* fix: wrong indentation
* add tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* support negative prompts in sd jax pipeline
* pass batched neg_prompt
* only encode when negative prompt is None
Co-authored-by: Juan Acevedo <jfacevedo@google.com>
* Add legacy inpainting pipeline compatibility for onnx
* remove commented out line
* Add onnx legacy inpainting test
* Fix slow decorators
* pep8 styling
* isort styling
* dummy object
* ordering consistency
* style
* docstring styles
* Refactor common prompt encoding pattern
* Update tests to permanent repository home
* support all available schedulers until ONNX IO binding is available
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* updated styling from PR suggested feedback
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Casting `self.sigmas` into a different dtype (the one of original_samples) is not advisable. In my img2img pipeline this leads to a long running time in the `integrate.quad` call later on- by long I mean more than 10x slower.
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* being tests
* fix model ids
* don't use safety checker in tests
* add im2img2 tests
* fix integration tests
* integration tests
* style
* add sentencepiece in test dep
* quality
* 4 decimalk points
* fix im2img test
* increase the tok slightly
* add conversion script for vae
* up
* up
* some fixes
* add text model
* use the correct config
* add docs
* move model in it's own file
* move model in its own file
* pass attenion mask to text encoder
* pass attn mask to uncond inputs
* quality
* fix image2image
* add imag2image in init
* fix import
* fix one more import
* fix import, dummy objetcs
* fix copied from
* up
* finish
Co-authored-by: patil-suraj <surajp815@gmail.com>
* add conversion script for vae
* uP
* uP
* more changes
* push
* up
* finish again
* up
* up
* up
* up
* finish
* up
* uP
* up
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* up
* up
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* re-add RL model code
* match model forward api
* add register_to_config, pass training tests
* fix tests, update forward outputs
* remove unused code, some comments
* add to docs
* remove extra embedding code
* unify time embedding
* remove conv1d output sequential
* remove sequential from conv1dblock
* style and deleting duplicated code
* clean files
* remove unused variables
* clean variables
* add 1d resnet block structure for downsample
* rename as unet1d
* fix renaming
* rename files
* add get_block(...) api
* unify args for model1d like model2d
* minor cleaning
* fix docs
* improve 1d resnet blocks
* fix tests, remove permuts
* fix style
* add output activation
* rename flax blocks file
* Add Value Function and corresponding example script to Diffuser implementation (#884)
* valuefunction code
* start example scripts
* missing imports
* bug fixes and placeholder example script
* add value function scheduler
* load value function from hub and get best actions in example
* very close to working example
* larger batch size for planning
* more tests
* merge unet1d changes
* wandb for debugging, use newer models
* success!
* turns out we just need more diffusion steps
* run on modal
* merge and code cleanup
* use same api for rl model
* fix variance type
* wrong normalization function
* add tests
* style
* style and quality
* edits based on comments
* style and quality
* remove unused var
* hack unet1d into a value function
* add pipeline
* fix arg order
* add pipeline to core library
* community pipeline
* fix couple shape bugs
* style
* Apply suggestions from code review
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
* update post merge of scripts
* add mdiblock / outblock architecture
* Pipeline cleanup (#947)
* valuefunction code
* start example scripts
* missing imports
* bug fixes and placeholder example script
* add value function scheduler
* load value function from hub and get best actions in example
* very close to working example
* larger batch size for planning
* more tests
* merge unet1d changes
* wandb for debugging, use newer models
* success!
* turns out we just need more diffusion steps
* run on modal
* merge and code cleanup
* use same api for rl model
* fix variance type
* wrong normalization function
* add tests
* style
* style and quality
* edits based on comments
* style and quality
* remove unused var
* hack unet1d into a value function
* add pipeline
* fix arg order
* add pipeline to core library
* community pipeline
* fix couple shape bugs
* style
* Apply suggestions from code review
* clean up comments
* convert older script to using pipeline and add readme
* rename scripts
* style, update tests
* delete unet rl model file
* remove imports in src
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
* Update src/diffusers/models/unet_1d_blocks.py
* Update tests/test_models_unet.py
* RL Cleanup v2 (#965)
* valuefunction code
* start example scripts
* missing imports
* bug fixes and placeholder example script
* add value function scheduler
* load value function from hub and get best actions in example
* very close to working example
* larger batch size for planning
* more tests
* merge unet1d changes
* wandb for debugging, use newer models
* success!
* turns out we just need more diffusion steps
* run on modal
* merge and code cleanup
* use same api for rl model
* fix variance type
* wrong normalization function
* add tests
* style
* style and quality
* edits based on comments
* style and quality
* remove unused var
* hack unet1d into a value function
* add pipeline
* fix arg order
* add pipeline to core library
* community pipeline
* fix couple shape bugs
* style
* Apply suggestions from code review
* clean up comments
* convert older script to using pipeline and add readme
* rename scripts
* style, update tests
* delete unet rl model file
* remove imports in src
* add specific vf block and update tests
* style
* Update tests/test_models_unet.py
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
* fix quality in tests
* fix quality style, split test file
* fix checks / tests
* make timesteps closer to main
* unify block API
* unify forward api
* delete lines in examples
* style
* examples style
* all tests pass
* make style
* make dance_diff test pass
* Refactoring RL PR (#1200)
* init file changes
* add import utils
* finish cleaning files, imports
* remove import flags
* clean examples
* fix imports, tests for merge
* update readmes
* hotfix for tests
* quality
* fix some tests
* change defaults
* more mps test fixes
* unet1d defaults
* do not default import experimental
* defaults for tests
* fix tests
* fix-copies
* fix
* changes per Patrik's comments (#1285)
* changes per Patrik's comments
* update conversion script
* fix renaming
* skip more mps tests
* last test fix
* Update examples/rl/README.md
Co-authored-by: Ben Glickenhaus <benglickenhaus@gmail.com>
* Add a reference to the name 'Sampler'
- Facilitate people that are familiar with the name samplers to understand that we call that schedulers
- Better SEO if people are googling for samplers to find our library as well
* Update README.md with a reference to 'Sampler'
* Match the generator device to the pipeline for DDPM and DDIM
* style
* fix
* update values
* fix fast tests
* trigger slow tests
* deprecate
* last value fixes
* mps fixes
Flax tests: don't hardcode number of devices.
This makes it possible to test on CPU/GPU. However, expected slices are
only checked when there are 8 devices.
* [Scheduler] Move predict epsilon to init
* up
* uP
* uP
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* up
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Make errors for invalid options without "--with_prior_preservation"
* Make --instance_prompt required
* Removed needless check because --instance_data_dir is marked with required
* Updated messages
* Use logger.warning instead of raise errors
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Schedulers: don't use float64 on mps
* Test set_timesteps() on device (float schedulers).
* SD pipeline: use device in set_timesteps.
* SD in-painting pipeline: use device in set_timesteps.
* Tests: fix mps crashes.
* Skip test_load_pipeline_from_git on mps.
Not compatible with float16.
* Use device.type instead of str in Euler schedulers.
* adds image to image inpainting with `PIL.Image.Image` inputs
the base implementation claims to support `torch.Tensor` but seems it
would also fail in this case.
* `make style` and `make quality`
* updates community examples readme
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add enable sequential cpu offloading to other stable diffusion pipelines
* trigger ci
* fix styling
* interpolate before converting to device to avoid breking when cpu_offload is enabled with fp16
Co-authored-by: Pedro Gengo <pedro.gabriel.lourenco@hotmail.com>
* style again I need to stop forgething this thing
* fix inpainting bug that could cause device misalignment
Co-authored-by: Pedro Gengo <pedro.gabriel.lourenco@hotmail.com>
* Apply suggestions from code review
Co-authored-by: Pedro Gengo <pedro.gabriel.lourenco@hotmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* increase the precision of slice-based tests and make the default test case easier to single out
* increase precision of unit tests which already rely on float comparisons
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make accelerate hard dep
* default fast init
* move params to cpu when device map is None
* handle device_map=None
* handle torch < 1.9
* remove device_map="auto"
* style
* add accelerate in torch extra
* remove accelerate from extras["test"]
* raise an error if torch is available but not accelerate
* update installation docs
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* improve defautl loading speed even further, allow disabling fats loading
* address review comments
* adapt the tests
* fix test_stable_diffusion_fast_load
* fix test_read_init
* temp fix for dummy checks
* Trigger Build
* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* Changes for VQ-diffusion VQVAE
Add specify dimension of embeddings to VQModel:
`VQModel` will by default set the dimension of embeddings to the number
of latent channels. The VQ-diffusion VQVAE has a smaller
embedding dimension, 128, than number of latent channels, 256.
Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
unet block helpers. VQ-diffusion's VQVAE uses those two block types.
* Changes for VQ-diffusion transformer
Modify attention.py so SpatialTransformer can be used for
VQ-diffusion's transformer.
SpatialTransformer:
- Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
- `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
- modified forward pass to take optional timestep embeddings
ImagePositionalEmbeddings:
- added to provide positional embeddings to discrete inputs for latent pixels
BasicTransformerBlock:
- norm layers were made configurable so that the VQ-diffusion could use AdaLayerNorm with timestep embeddings
- modified forward pass to take optional timestep embeddings
CrossAttention:
- now may optionally take a bias parameter for its query, key, and value linear layers
FeedForward:
- Internal layers are now configurable
ApproximateGELU:
- Activation function in VQ-diffusion's feedforward layer
AdaLayerNorm:
- Norm layer modified to incorporate timestep embeddings
* Add VQ-diffusion scheduler
* Add VQ-diffusion pipeline
* Add VQ-diffusion convert script to diffusers
* Add VQ-diffusion dummy objects
* Add VQ-diffusion markdown docs
* Add VQ-diffusion tests
* some renaming
* some fixes
* more renaming
* correct
* fix typo
* correct weights
* finalize
* fix tests
* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* finish
* finish
* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* feat: add repaint
* fix: fix quality check with `make fix-copies`
* fix: remove old unnecessary arg
* chore: change default to DDPM (looks better in experiments)
* ".to(device)" changed to "device="
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* make generator device-specific
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* make generator device-specific and change shape
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* fix: add preprocessing for image and mask
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* fix: update test
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Update src/diffusers/pipelines/repaint/pipeline_repaint.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add docs and examples
* Fix toctree
Co-authored-by: fja <fja@zurich.ibm.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* changed training example to add option to train model that predicts x0 (instead of eps), changed DDPM pipeline accordingly
* Revert "changed training example to add option to train model that predicts x0 (instead of eps), changed DDPM pipeline accordingly"
This reverts commit c5efb52564.
* changed training example to add option to train model that predicts x0 (instead of eps), changed DDPM pipeline accordingly
* fixed code style
Co-authored-by: lukovnikov <lukovnikov@users.noreply.github.com>
* Fix equality test for ddim and ddpm
* add docs for use_clipped_model_output in DDIM
* fix inline comment
* reorder imports in test_pipelines.py
* Ignore use_clipped_model_output if scheduler doesn't take it
* improve test precision
get tests passing with greater precision using lewington images
* make old numpy load function a wrapper around a more flexible numpy loading function
* adhere to black formatting
* add more black formatting
* adhere to isort
* loosen precision and replace path
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* 2x speedup using memory efficient attention
* remove einops dependency
* Swap K, M in op instantiation
* Simplify code, remove unnecessary maybe_init call and function, remove unused self.scale parameter
* make xformers a soft dependency
* remove one-liner functions
* change one letter variable to appropriate names
* Remove Env variable dependency, remove MemoryEfficientCrossAttention class and use enable_xformers_memory_efficient_attention method
* Add memory efficient attention toggle to img2img and inpaint pipelines
* Clearer management of xformers' availability
* update optimizations markdown to add info about memory efficient attention
* add benchmarks for TITAN RTX
* More detailed explanation of how the mem eff benchmark were ran
* Removing autocast from optimization markdown
* import_utils: import torch only if is available
Co-authored-by: Nouamane Tazi <nouamane98@gmail.com>
* initial commit to add imagic to stable diffusion community pipelines
* remove some testing changes
* comments from PR review for imagic stable diffusion
* remove changes from pipeline_stable_diffusion as part of imagic pipeline
* clean up example code and add line back in to pipeline_stable_diffusion for imagic pipeline
* remove unused functions
* small code quality changes for imagic pipeline
* clean up readme
* remove hardcoded logging values for imagic community example
* undo change for DDIMScheduler
Remove some unused parameter
The `downsample_padding` parameter does not seem to be used in `CrossAttnUpBlock2D` (or by any up block for that matter) so removing it.
* [Better scheduler docs] Improve usage examples of schedulers
* finish
* fix warnings and add test
* finish
* more replacements
* adapt fast tests hf token
* correct more
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Integrate compatibility with euler
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Docs: refer to pre-RC version of PyTorch 1.13.0.
* Remove temporary workaround for unavailable op.
* Update comment to make it less ambiguous.
* Remove use of contiguous in mps.
It appears to not longer be necessary.
* Special case: use einsum for much better performance in mps
* Update mps docs.
* MPS: make pipeline work in half precision.
Tests: upgrade PyTorch cuda to 11.7.
Otherwise the cuda versions of torch and torchvision mismatch, and
examples tests fail. We were requesting cuda 11.6 for PyTorch, and the
default torchvision (via setup.py).
Another option would be to include torchvision in the same pip install
line as torch.
* Add failing test for #940.
* Do not use torch.float64 in mps.
* style
* Temporarily skip add_noise for IPNDMScheduler.
Until #990 is addressed.
* Fix additional float64 error in mps.
* Improve add_noise test
* Slight edit – I think it's clearer this way.
* add textual inversion flax
* make style
* make style
* replicate vae and unet params
* make style
* minor
* save after end of training
* style
* Temporary fix
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Add Flax instruction
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Make training code usable by external scripts
Add parameter inputs to training and argument parsing function to allow this script to be used by an external call.
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add method to enable cuda with minimal gpu usage to stable diffusion
* add test to minimal cuda memory usage
* ensure all models but unet are onn torch.float32
* move to cpu_offload along with minor internal changes to make it work
* make it test against accelerate master branch
* coming back, its official: I don't know how to make it test againt the master branch from accelerate
* make it install accelerate from master on tests
* go back to accelerate>=0.11
* undo prettier formatting on yml files
* undo prettier formatting on yml files againn
* start
* add more logic
* Update src/diffusers/models/unet_2d_condition_flax.py
* match weights
* up
* make model work
* making class more general, fixing missed file rename
* small fix
* make new conversion work
* up
* finalize conversion
* up
* first batch of variable renamings
* remove c and c_prev var names
* add mid and out block structure
* add pipeline
* up
* finish conversion
* finish
* upload
* more fixes
* Apply suggestions from code review
* add attr
* up
* uP
* up
* finish tests
* finish
* uP
* finish
* fix test
* up
* naming consistency in tests
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* remove hardcoded 16
* Remove bogus
* fix some stuff
* finish
* improve logging
* docs
* upload
Co-authored-by: Nathan Lambert <nol@berkeley.edu>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* Docs: refer to pre-RC version of PyTorch 1.13.0.
* Remove temporary workaround for unavailable op.
* Update comment to make it less ambiguous.
* Remove use of contiguous in mps.
It appears to not longer be necessary.
* Special case: use einsum for much better performance in mps
* Update mps docs.
* Minor doc update.
* Accept suggestion
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* Update README.md
Additionally add FLAX so the model card can be slimmer and point to this page
* Find and replace all
* v-1-5 -> v1-5
* revert test changes
* Update README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update docs/source/quicktour.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update README.md
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update docs/source/quicktour.mdx
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update README.md
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Revert certain references to v1-5
* Docs changes
* Apply suggestions from code review
Co-authored-by: apolinario <joaopaulo.passos+multimodal@gmail.com>
Co-authored-by: anton-l <anton@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Initial Wildcard Stable Diffusion Pipeline
* Added some additional example usage
* style
* Added links in README and additional documentation
* Initial Wildcard Stable Diffusion Pipeline
* Added some additional example usage
* style
* Added links in README and additional documentation
* cleanup readme again
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Support LMSDiscreteScheduler in LDMPipeline
This is a small change to support all schedulers such as LMSDiscreteScheduler in LDMPipeline.
What's changed
-------
* Add the `scale_model_input` function before `step` to ensure correct denoising (L77)
* Add "scale the initial noise by the standard deviation required by the scheduler"
* run `make style`
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* First draft
* created the SpeechToImagePipeline class
* Corrected speech_to_image_diffusion.py style
* Added safety checker
* Corrected style
* Adding examples to README
* begin pipe
* add new pipeline
* add tests
* correct fast test
* up
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py
* Update tests/test_pipelines.py
* up
* up
* make style
* add fp16 test
* doc, comments
* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* [CI] Add Apple M1 tests
* setup-python
* python build
* conda install
* remove branch
* only 3.8 is built for osx-arm
* try fetching prebuilt tokenizers
* use user cache
* update shells
* Reports and cleanup
* -> MPS
* Disable parallel tests
* Better naming
* investigate worker crash
* return xdist
* restart
* num_workers=2
* still crashing?
* faulthandler for segfaults
* faulthandler for segfaults
* remove restarts, stop on segfault
* torch version
* change installation order
* Use pre-RC version of PyTorch.
To be updated when it is released.
* Skip crashing test on MPS, add new one that works.
* Skip cuda tests in mps device.
* Actually use generator in test.
I think this was a typo.
* make style
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Bump to 0.6.0.dev0
* Deprecate tensor_format and .samples
* style
* upd
* upd
* style
* sample -> images
* Update src/diffusers/schedulers/scheduling_ddpm.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/schedulers/scheduling_ddim.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/schedulers/scheduling_karras_ve.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/schedulers/scheduling_lms_discrete.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/schedulers/scheduling_pndm.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/schedulers/scheduling_sde_ve.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/schedulers/scheduling_sde_vp.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Remove set_format in Flax pipeline.
* Remove DummyChecker.
* Run safety_checker in pipeline.
* Don't pmap on every call.
We could have decorated `generate` with `pmap`, but I wanted to keep it
in case someone wants to invoke it in non-parallel mode.
* Remove commented line
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Replicate outside __call__, prepare for optional jitting.
* Remove unnecessary clipping.
As suggested by @kashif.
* Do not jit unless requested.
* Send all args to generate.
* make style
* Remove unused imports.
* Fix docstring.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Give more customizable options for safety checker
* Apply suggestions from code review
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py
* Finish
* make style
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* up
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Add diffusers version and pipeline class to the Hub UA
* Fallback to class name for pipelines
* Update src/diffusers/modeling_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/modeling_flax_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Remove autoclass
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [Dummy imports] Better error message
* Test: load pipeline with LMS scheduler.
Fails with a cryptic message if scipy is not installed.
* Correct
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* mps: alt. implementation for repeat_interleave
* style
* Bump mps version of PyTorch in the documentation.
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Simplify: do not check for device.
* style
* Fix repeat dimensions:
- The unconditional embeddings are always created from a single prompt.
- I was shadowing the batch_size var.
* Split long lines as suggested by Suraj.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* pass norm_num_groups param and add tests
* set resnet_groups for FlaxUNetMidBlock2D
* fixed docstrings
* fixed typo
* using is_flax_available util and created require_flax decorator
* begin text2image script
* loading the datasets, preprocessing & transforms
* handle input features correctly
* add gradient checkpointing support
* fix output names
* run unet in train mode not text encoder
* use no_grad instead of freezing params
* default max steps None
* pad to longest
* don't pad when tokenizing
* fix encode on multi gpu
* fix stupid bug
* add random flip
* add ema
* fix ema
* put ema on cpu
* improve EMA model
* contiguous_format
* don't warp vae and text encode in accelerate
* remove no_grad
* use randn_like
* fix resize
* improve few things
* log epoch loss
* set log level
* don't log each step
* remove max_length from collate
* style
* add report_to option
* make scale_lr false by default
* add grad clipping
* add an option to use 8bit adam
* fix logging in multi-gpu, log every step
* more comments
* remove eval for now
* adress review comments
* add requirements file
* begin readme
* begin readme
* fix typo
* fix push to hub
* populate readme
* update readme
* remove use_auth_token from the script
* address some review comments
* better mixed precision support
* remove redundant to
* create ema model early
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* better description for train_data_dir
* add diffusers in requirements
* update dataset_name_mapping
* update readme
* add inference example
Co-authored-by: anton-l <anton@huggingface.co>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Support deepspeed
* Dreambooth DeepSpeed documentation
* Remove unnecessary casts, documentation
Due to recent commits some casts to half precision are not necessary
anymore.
Mention that DeepSpeed's version of Adam is about 2x faster.
* Review comments
* add accelerate to load models with smaller memory footprint
* remove low_cpu_mem_usage as it is reduntant
* move accelerate init weights context to modelling utils
* add test to ensure results are the same when loading with accelerate
* add tests to ensure ram usage gets lower when using accelerate
* move accelerate logic to single snippet under modelling utils and remove it from configuration utils
* format code using to pass quality check
* fix imports with isor
* add accelerate to test extra deps
* only import accelerate if device_map is set to auto
* move accelerate availability check to diffusers import utils
* format code
* add device map to pipeline abstraction
* lint it to pass PR quality check
* fix class check to use accelerate when using diffusers ModelMixin subclasses
* use low_cpu_mem_usage in transformers if device_map is not available
* NoModuleLayer
* comment out tests
* up
* uP
* finish
* Update src/diffusers/pipelines/stable_diffusion/safety_checker.py
* finish
* uP
* make style
Co-authored-by: Pi Esposito <piero.skywalker@gmail.com>
* debug an exception
if dst_path is not a file, it will raise Exception in the function src_path.samefile:
FileNotFoundError: [Errno 2] No such file or directory: '/home/lilongwei/notebook/onnx_diffusion/vae_decoder/model.onnx'
* Update src/diffusers/onnx_utils.py
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* handle dtype in vae and image2image pipeline
* fix inpaint in fp16
* dtype should be handled in add_noise
* style
* address review comments
* add simple fast tests to check fp16
* fix test name
* put mask in fp16
This is to ensure that the final latent slices stay somewhat consistent as more changes are introduced into the library.
Signed-off-by: James R T <jamestiotio@gmail.com>
Signed-off-by: James R T <jamestiotio@gmail.com>
* Swap fp16 error to warning
Also remove the associated test
* Formatting
* warn -> warning
* Update src/diffusers/pipeline_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Raise an error when moving an fp16 pipeline to CPU
* Raise an error when moving an fp16 pipeline to CPU
* style
* Update src/diffusers/pipeline_utils.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Update src/diffusers/pipeline_utils.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Improve the message
* cuda
* Update tests/test_pipelines.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* init
* improve add_noise
* [debug start] run slow test
* [debug end]
* quick revert
* Add docstrings and warnings + API tests
* Make the warning less spammy
* renamed single letter variables
* renamed x to meaningful variable in resnet.py
Hello @patil-suraj can you verify it
Thanks
* Reformatted using black
* renamed x to meaningful variable in resnet.py
Hello @patil-suraj can you verify it
Thanks
* reformatted the files
* modified unboundlocalerror in line 374
* removed referenced before error
* renamed single variable x -> hidden_state, p-> pad_value
Co-authored-by: Nikhil A V <nikhilav@Nikhils-MacBook-Pro.local>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Add an argument "negative_prompt"
* Fix argument order
* Fix to use TypeError instead of ValueError
* Removed needless batch_size multiplying
* Fix to multiply by batch_size
* Add truncation=True for long negative prompt
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_onnx.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_onnx.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix styles
* Renamed ucond_tokens to uncond_tokens
* Added description about "negative_prompt"
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add accelerate to load models with smaller memory footprint
* remove low_cpu_mem_usage as it is reduntant
* move accelerate init weights context to modelling utils
* add test to ensure results are the same when loading with accelerate
* add tests to ensure ram usage gets lower when using accelerate
* move accelerate logic to single snippet under modelling utils and remove it from configuration utils
* format code using to pass quality check
* fix imports with isor
* add accelerate to test extra deps
* only import accelerate if device_map is set to auto
* move accelerate availability check to diffusers import utils
* format code
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Conversion script
* ran black
* ran isort
* remove unused import
* map location so everything gets loaded onto CPU before conversion
* ran black again
* Update setup.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Don't use `load_state_dict` if torch is not installed.
* Define `SchedulerOutput` to use torch or flax arrays.
* Don't import LMSDiscreteScheduler without torch.
* Create distinct FlaxSchedulerOutput.
* Additional changes required for FlaxSchedulerMixin
* Do not import torch pipelines in Flax.
* Revert "Define `SchedulerOutput` to use torch or flax arrays."
This reverts commit f653140134.
* Prefix Flax scheduler outputs for consistency.
* make style
* FlaxSchedulerOutput is now a dataclass.
* Don't use f-string without placeholders.
* Add blank line.
* Style (docstrings)
* Add callback parameters for Stable Diffusion pipelines
Signed-off-by: James R T <jamestiotio@gmail.com>
* Lint code with `black --preview`
Signed-off-by: James R T <jamestiotio@gmail.com>
* Refactor callback implementation for Stable Diffusion pipelines
* Fix missing imports
Signed-off-by: James R T <jamestiotio@gmail.com>
* Fix documentation format
Signed-off-by: James R T <jamestiotio@gmail.com>
* Add kwargs parameter to standardize with other pipelines
Signed-off-by: James R T <jamestiotio@gmail.com>
* Modify Stable Diffusion pipeline callback parameters
Signed-off-by: James R T <jamestiotio@gmail.com>
* Remove useless imports
Signed-off-by: James R T <jamestiotio@gmail.com>
* Change types for timestep and onnx latents
* Fix docstring style
* Return decode_latents and run_safety_checker back into __call__
* Remove unused imports
* Add intermediate state tests for Stable Diffusion pipelines
Signed-off-by: James R T <jamestiotio@gmail.com>
* Fix intermediate state tests for Stable Diffusion pipelines
Signed-off-by: James R T <jamestiotio@gmail.com>
Signed-off-by: James R T <jamestiotio@gmail.com>
* Allow resolutions that are not multiples of 64
* ran black
* fix bug
* add test
* more explanation
* more comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* initial commit
* make UNet stream capturable
* try to fix noise_pred value
* remove cuda graph and keep NB
* non blocking unet with PNDMScheduler
* make timesteps np arrays for pndm scheduler
because lists don't get formatted to tensors in `self.set_format`
* make max async in pndm
* use channel last format in unet
* avoid moving timesteps device in each unet call
* avoid memcpy op in `get_timestep_embedding`
* add `channels_last` kwarg to `DiffusionPipeline.from_pretrained`
* update TODO
* replace `channels_last` kwarg with `memory_format` for more generality
* revert the channels_last changes to leave it for another PR
* remove non_blocking when moving input ids to device
* remove blocking from all .to() operations at beginning of pipeline
* fix merging
* fix merging
* model can run in other precisions without autocast
* attn refactoring
* Revert "attn refactoring"
This reverts commit 0c70c0e189.
* remove restriction to run conv_norm in fp32
* use `baddbmm` instead of `matmul`for better in attention for better perf
* removing all reshapes to test perf
* Revert "removing all reshapes to test perf"
This reverts commit 006ccb8a8c.
* add shapes comments
* hardcore whats needed for jitting
* Revert "hardcore whats needed for jitting"
This reverts commit 2fa9c698ea.
* Revert "remove restriction to run conv_norm in fp32"
This reverts commit cec592890c.
* revert using baddmm in attention's forward
* cleanup comment
* remove restriction to run conv_norm in fp32. no quality loss was noticed
This reverts commit cc9bc1339c.
* add more optimizations techniques to docs
* Revert "add shapes comments"
This reverts commit 31c58eadb8.
* apply suggestions
* make quality
* apply suggestions
* styling
* `scheduler.timesteps` are now arrays so we dont need .to()
* remove useless .type()
* use mean instead of max in `test_stable_diffusion_inpaint_pipeline_k_lms`
* move scheduler timestamps to correct device if tensors
* add device to `set_timesteps` in LMSD scheduler
* `self.scheduler.set_timesteps` now uses device arg for schedulers that accept it
* quick fix
* styling
* remove kwargs from schedulers `set_timesteps`
* revert to using max in K-LMS inpaint pipeline test
* Revert "`self.scheduler.set_timesteps` now uses device arg for schedulers that accept it"
This reverts commit 00d5a51e5c.
* move timesteps to correct device before loop in SD pipeline
* apply previous fix to other SD pipelines
* UNet now accepts tensor timesteps even on wrong device, to avoid errors
- it shouldnt affect performance if timesteps are alrdy on correct device
- it does slow down performance if they're on the wrong device
* fix pipeline when timesteps are arrays with strides
* correcting the beta value assignment
* updating DDIM and LMSDiscreteFlax schedulers
* bringing back the changes that were lost as part of main branch merge
* Replace deprecation warning f-string with class name.
When `__repr__` is invoked in the instance serialization of
`config_dict` fails, because it contains `kwargs` of type `<class
inspect._empty>`.
* Revert "Replace deprecation warning f-string with class name."
This reverts commit 1c4eb8cb10.
* Do not attempt to register `"kwargs"` as an attribute.
Otherwise serialization could fail.
This may happen for other attributes, so we should create a better
solution.
* pytorch only schedulers
* fix style
* remove match_shape
* pytorch only ddpm
* remove SchedulerMixin
* remove numpy from karras_ve
* fix types
* remove numpy from lms_discrete
* remove numpy from pndm
* fix typo
* remove mixin and numpy from sde_vp and ve
* remove remaining tensor_format
* fix style
* sigmas has to be torch tensor
* removed set_format in readme
* remove set format from docs
* remove set_format from pipelines
* update tests
* fix typo
* continue to use mixin
* fix imports
* removed unsed imports
* match shape instead of assuming image shapes
* remove import typo
* update call to add_noise
* use math instead of numpy
* fix t_index
* removed commented out numpy tests
* timesteps needs to be discrete
* cast timesteps to int in flax scheduler too
* fix device mismatch issue
* small fix
* Update src/diffusers/schedulers/scheduling_pndm.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* WIP: flax FlaxDiffusionPipeline & FlaxStableDiffusionPipeline
* todo comment
* Fix imports
* Fix imports
* add dummies
* Fix empty init
* make pipeline work
* up
* Allow dtype to be overridden on model load.
This may be a temporary solution until #567 is addressed.
* Convert params to bfloat16 or fp16 after loading.
This deals with the weights, not the model.
* Use Flax schedulers (typing, docstring)
* PNDM: replace control flow with jax functions.
Otherwise jitting/parallelization don't work properly as they don't know
how to deal with traced objects.
I temporarily removed `step_prk`.
* Pass latents shape to scheduler set_timesteps()
PNDMScheduler uses it to reserve space, other schedulers will just
ignore it.
* Wrap model imports inside availability checks.
* Optionally return state in from_config.
Useful for Flax schedulers.
* Do not convert model weights to dtype.
* Re-enable PRK steps with functional implementation.
Values returned still not verified for correctness.
* Remove left over has_state var.
* make style
* Apply suggestion list -> tuple
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Apply suggestion list -> tuple
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Remove unused comments.
* Use zeros instead of empty.
Co-authored-by: Mishig Davaadorj <dmishig@gmail.com>
Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Return encoded texts by DiffusionPipelines
* Updated README to show hot to use enoded_text_input
* Reverted examples in README.md
* Reverted all
* Warning for long prompts
* Fix bugs
* Formatted
* refactor: pipelines readability improvements
Signed-off-by: Ryan Russell <git@ryanrussell.org>
* docs: remove todo comment from flax pipeline
Signed-off-by: Ryan Russell <git@ryanrussell.org>
Signed-off-by: Ryan Russell <git@ryanrussell.org>
* Adding pred_original_sample to SchedulerOutput of DDPMScheduler, DDIMScheduler, LMSDiscreteScheduler, KarrasVeScheduler step methods so we can access the predicted denoised outputs
* Gave DDPMScheduler, DDIMScheduler and LMSDiscreteScheduler their own output dataclasses so the default SchedulerOutput in scheduling_utils does not need pred_original_sample as an optional extra
* Reordered library imports to follow standard
* didnt get import order quite right apparently
* Forgot to change name of LMSDiscreteSchedulerOutput
* Aha, needed some extra libs for make style to fully work
* add grad ckpt to downsample blocks
* make it work
* don't pass gradient_checkpointing to upsample block
* add tests for UNet2DConditionModel
* add test_gradient_checkpointing
* add gradient_checkpointing for up and down blocks
* add functions to enable and disable grad ckpt
* remove the forward argument
* better naming
* make supports_gradient_checkpointing private
* Optionally return state in from_config.
Useful for Flax schedulers.
* has_state is now a property, make check more strict.
I don't check the class is `SchedulerMixin` to prevent circular
dependencies. It should be enough that the class name starts with "Flax"
the object declares it "has_state" and the "create_state" exists too.
* Use state in pipeline from_pretrained.
* Make style
* Fix typo in docstring.
* Allow dtype to be overridden on model load.
This may be a temporary solution until #567 is addressed.
* Create latents in float32
The denoising loop always computes the next step in float32, so this
would fail when using `bfloat16`.
* WIP: flax FlaxDiffusionPipeline & FlaxStableDiffusionPipeline
* todo comment
* Fix imports
* Fix imports
* add dummies
* Fix empty init
* make pipeline work
* up
* Use Flax schedulers (typing, docstring)
* Wrap model imports inside availability checks.
* more updates
* make sure flax is not broken
* make style
* more fixes
* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@latenitesoft.com>
* first commit:
- add `from_pt` argument in `from_pretrained` function
- add `modeling_flax_pytorch_utils.py` file
* small nit
- fix a small nit - to not enter in the second if condition
* major changes
- modify FlaxUnet modules
- first conversion script
- more keys to be matched
* keys match
- now all keys match
- change module names for correct matching
- upsample module name changed
* working v1
- test pass with atol and rtol= `4e-02`
* replace unsued arg
* make quality
* add small docstring
* add more comments
- add TODO for embedding layers
* small change
- use `jnp.expand_dims` for converting `timesteps` in case it is a 0-dimensional array
* add more conditions on conversion
- add better test to check for keys conversion
* make shapes consistent
- output `img_w x img_h x n_channels` from the VAE
* Revert "make shapes consistent"
This reverts commit 4cad1aeb4a.
* fix unet shape
- channels first!
* Unify offset configuration in DDIM and PNDM schedulers
* Format
Add missing variables
* Fix pipeline test
* Update src/diffusers/schedulers/scheduling_ddim.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Default set_alpha_to_one to false
* Format
* Add tests
* Format
* add deprecation warning
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix is_onnx_available
Fix: If user install onnxruntime-gpu, is_onnx_available() will return False.
* add more onnxruntime candidates
* Run `make style`
Co-authored-by: anton-l <anton@huggingface.co>
* begin text2img conversion script
* add fn to convert config
* create config if not provided
* update imports and use UNet2DConditionModel
* fix imports, layer names
* fix unet coversion
* add function to convert VAE
* fix vae conversion
* update main
* create text model
* update config creating logic for unet
* fix config creation
* update script to create and save pipeline
* remove unused imports
* fix checkpoint loading
* better name
* save progress
* finish
* up
* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* First UNet Flax modeling blocks.
Mimic the structure of the PyTorch files.
The model classes themselves need work, depending on what we do about
configuration and initialization.
* Remove FlaxUNet2DConfig class.
* ignore_for_config non-config args.
* Implement `FlaxModelMixin`
* Use new mixins for Flax UNet.
For some reason the configuration is not correctly applied; the
signature of the `__init__` method does not contain all the parameters
by the time it's inspected in `extract_init_dict`.
* Import `FlaxUNet2DConditionModel` if flax is available.
* Rm unused method `framework`
* Update src/diffusers/modeling_flax_utils.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Indicate types in flax.struct.dataclass as pointed out by @mishig25
Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>
* Fix typo in transformer block.
* make style
* some more changes
* make style
* Add comment
* Update src/diffusers/modeling_flax_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Rm unneeded comment
* Update docstrings
* correct ignore kwargs
* make style
* Update docstring examples
* Make style
* Style: remove empty line.
* Apply style (after upgrading black from pinned version)
* Remove some commented code and unused imports.
* Add init_weights (not yet in use until #513).
* Trickle down deterministic to blocks.
* Rename q, k, v according to the latest PyTorch version.
Note that weights were exported with the old names, so we need to be
careful.
* Flax UNet docstrings, default props as in PyTorch.
* Fix minor typos in PyTorch docstrings.
* Use FlaxUNet2DConditionOutput as output from UNet.
* make style
Co-authored-by: Mishig Davaadorj <dmishig@gmail.com>
Co-authored-by: Mishig Davaadorj <mishig.davaadorj@coloradocollege.edu>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* beta never changes removed from state
* fix typos in docs
* removed unused var
* initial ddim flax scheduler
* import
* added dummy objects
* fix style
* fix typo
* docs
* fix typo in comment
* set return type
* added flax ddom
* fix style
* remake
* pass PRNG key as argument and split before use
* fix doc string
* use config
* added flax Karras VE scheduler
* make style
* fix dummy
* fix ndarray type annotation
* replace returns a new state
* added lms_discrete scheduler
* use self.config
* add_noise needs state
* use config
* use config
* docstring
* added flax score sde ve
* fix imports
* fix typos
* add different method for sliced attention
* Update src/diffusers/models/attention.py
* Apply suggestions from code review
* Update src/diffusers/models/attention.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* initial attempt at solving
* fix pndm power of 3 inference_step
* add power of 3 test
* fix index in pndm test, remove ddim test
* add comments, change to round()
* update expected results of slow tests
* relax sum and mean tests
* Print shapes when reporting exception
* formatting
* fix sentence
* relax test_stable_diffusion_fast_ddim for gpu fp16
* relax flakey tests on GPU
* added comment on large tolerences
* black
* format
* set scheduler seed
* added generator
* use np.isclose
* set num_inference_steps to 50
* fix dep. warning
* update expected_slice
* preprocess if image
* updated expected results
* updated expected from CI
* pass generator to VAE
* undo change back to orig
* use orignal
* revert back the expected on cpu
* revert back values for CPU
* more undo
* update result after using gen
* update mean
* set generator for mps
* update expected on CI server
* undo
* use new seed every time
* cpu manual seed
* reduce num_inference_steps
* style
* use generator for randn
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* renamed variable names
q -> query
k -> key
v -> value
b -> batch
c -> channel
h -> height
w -> weight
* rename variable names
missed some in the initial commit
* renamed more variable names
As per code review suggestions, renamed x -> hidden_states and x_in -> residual
* fixed minor typo
* docs for attention
* types for embeddings
* unet2d docstrings
* UNet2DConditionModel docstrings
* fix typos
* style and vq-vae docstrings
* docstrings for VAE
* Update src/diffusers/models/unet_2d.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
* added inherits from sentence
* docstring to forward
* make style
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* finish model docs
* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Initial support for mps in Stable Diffusion pipeline.
* Initial "warmup" implementation when using mps.
* Make some deterministic tests pass with mps.
* Disable training tests when using mps.
* SD: generate latents in CPU then move to device.
This is especially important when using the mps device, because
generators are not supported there. See for example
https://github.com/pytorch/pytorch/issues/84288.
In addition, the other pipelines seem to use the same approach: generate
the random samples then move to the appropriate device.
After this change, generating an image in MPS produces the same result
as when using the CPU, if the same seed is used.
* Remove prints.
* Pass AutoencoderKL test_output_pretrained with mps.
Sampling from `posterior` must be done in CPU.
* Style
* Do not use torch.long for log op in mps device.
* Perform incompatible padding ops in CPU.
UNet tests now pass.
See https://github.com/pytorch/pytorch/issues/84535
* Style: fix import order.
* Remove unused symbols.
* Remove MPSWarmupMixin, do not apply automatically.
We do apply warmup in the tests, but not during normal use.
This adopts some PR suggestions by @patrickvonplaten.
* Add comment for mps fallback to CPU step.
* Add README_mps.md for mps installation and use.
* Apply `black` to modified files.
* Restrict README_mps to SD, show measures in table.
* Make PNDM indexing compatible with mps.
Addresses #239.
* Do not use float64 when using LDMScheduler.
Fixes#358.
* Fix typo identified by @patil-suraj
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Adapt example to new output style.
* Restore 1:1 results reproducibility with CompVis.
However, mps latents need to be generated in CPU because generators
don't work in the mps device.
* Move PyTorch nightly to requirements.
* Adapt `test_scheduler_outputs_equivalence` ton MPS.
* mps: skip training tests instead of ignoring silently.
* Make VQModel tests pass on mps.
* mps ddim tests: warmup, increase tolerance.
* ScoreSdeVeScheduler indexing made mps compatible.
* Make ldm pipeline tests pass using warmup.
* Style
* Simplify casting as suggested in PR.
* Add Known Issues to readme.
* `isort` import order.
* Remove _mps_warmup helpers from ModelMixin.
And just make changes to the tests.
* Skip tests using unittest decorator for consistency.
* Remove temporary var.
* Remove spurious blank space.
* Remove unused symbol.
* Remove README_mps.
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Initial version of `fp16` page.
* Fix typo in README.
* Change titles of fp16 section in toctree.
* PR suggestion
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* PR suggestion
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Clarify attention slicing is useful even for batches of 1
Explained by @patrickvonplaten after a suggestion by @keturn.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Do not talk about `batches` in `enable_attention_slicing`.
* Use Tip (just for fun), add link to method.
* Comment about fp16 results looking the same as float32 in practice.
* Style: docstring line wrapping.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* init schedulers docs
* add some docstrings, fix sidebar formatting
* add docstrings
* [Type hint] PNDM schedulers (#335)
* [Type hint] PNDM Schedulers
* ran make style
* updated timesteps type hint
* apply suggestions from code review
* ran make style
* removed unused import
* [Type hint] scheduling ddim (#343)
* [Type hint] scheduling ddim
* apply suggestions from code review
apply suggestions to also return the return type
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* make style
* update class docstrings
* add docstrings
* missed merge edit
* add general docs page
* modify headings for right sidebar
Co-authored-by: Partho <parthodas6176@gmail.com>
Co-authored-by: Santiago Víquez <santi.viquez@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Initial description of Stable Diffusion pipeline.
* Placeholder docstrings to test preview.
* Add docstrings to Stable Diffusion pipeline.
* Style
* Docs for all the SD pipelines + attention slicing.
* Style: wrap long lines.
* Update text_inversion.mdx
Getting in a bit of background info
* fixed typo mode -> model
* Link SD and re-write a few bits for clarity
* Copied in info from the example script
As suggested by surajpatil :)
* removed an unnecessary heading
Use `expand` instead of ones to broadcast tensor.
As suggested by @bes-dev. According the documentation this shouldn't
take any memory - it just plays with the strides.
* [Type hint] scheduling ddim
* apply suggestions from code review
apply suggestions to also return the return type
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [Type hint] PNDM Schedulers
* ran make style
* updated timesteps type hint
* apply suggestions from code review
* ran make style
* removed unused import
* Use ONNX / Core ML compatible method to broadcast.
Unfortunately `tile` could not be used either, it's still not compatible
with ONNX.
See #284.
* Add comment about why broadcast_to is not used.
Also, apply style to changed files.
* Make sure broadcast remains in same device.
* Fix tqdm and OOM
* tqdm auto
* tqdm is still spamming try to disable it altogether
* rather just set the pipe config, to keep the global tqdm clean
* style
* add textual inversion script
* make the loop work
* make coarse_loss optional
* save pipeline after training
* add arg pretrained_model_name_or_path
* fix saving
* fix gradient_accumulation_steps
* style
* fix progress bar steps
* scale lr
* add argument to accept style
* remove unused args
* scale lr using num gpus
* load tokenizer using args
* add checks when converting init token to id
* improve commnets and style
* document args
* more cleanup
* fix default adamw arsg
* TextualInversionWrapper -> CLIPTextualInversionWrapper
* fix tokenizer loading
* Use the CLIPTextModel instead of wrapper
* clean dataset
* remove commented code
* fix accessing grads for multi-gpu
* more cleanup
* fix saving on multi-GPU
* init_placeholder_token_embeds
* add seed
* fix flip
* fix multi-gpu
* add utility methods in wrapper
* remove ipynb
* don't use wrapper
* dont pass vae an dunet to accelerate prepare
* bring back accelerator.accumulate
* scale latents
* use only one progress bar for steps
* push_to_hub at the end of training
* remove unused args
* log some important stats
* store args in tensorboard
* pretty comments
* save the trained embeddings
* mobe the script up
* add requirements file
* more cleanup
* fux typo
* begin readme
* style -> learnable_property
* keep vae and unet in eval mode
* address review comments
* address more comments
* removed unused args
* add train command in readme
* update readme
* Changed variable name from "h" to "hidden_states"
Per issue #198 , changed variable name from "h" to "hidden_states" in the forward function only. I am happy to change any other variable names, please advise recommended new names.
* Update src/diffusers/models/resnet.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [Refactor] Remove set_seed and class attributes
* apply anton's suggestiosn
* fix
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* up
* update
* make style
* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* make fix-copies
* make style
* make style and new copies
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* format timesteps attrs to np arrays in pndm scheduler
because lists don't get formatted to tensors in `self.set_format`
* convert to long type to use timesteps as indices for tensors
* add scheduler set_format test
* fix `_timesteps` type
* make style with black 22.3.0 and isort 5.10.1
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [Type hint] Karras VE pipeline
* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* Helpful exception if inference steps not set in schedulers (#263)
* Apply suggestions from codereview by patrickvonplaten
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Refactor progress bar of pipeline __call__
* Make any tqdm configs available
* remove init
* add some tests
* remove file
* finish
* make style
* improve progress bar test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Init CI
* clarify cpu
* style
* Check scripts quality too
* Drop smi for cpu tests
* Run PR tests on cpu docker envs
* Update .github/workflows/push_tests.yml
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Try minimal python container
* Print env, install stable GPU torch
* Manual torch install
* remove deprecated platform.dist()
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Accept latents as input for StableDiffusionPipeline.
* Notebook to demonstrate reusable seeds (latents).
* More accurate type annotation
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Review comments: move to device, raise instead of assert.
* Actually commit the test notebook.
I had mistakenly pushed an empty file instead.
* Adapt notebook to Colab.
* Update examples readme.
* Move notebook to personal repo.
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Restore `is_modelcards_available` in `.utils`.
Otherwise attempting to import `hub_utils` (in training scripts, for
example), fails.
This was removed during the refactor in df90f0c.
* Implement `pipeline.to(device)`
* DiffusionPipeline.to() decides best device on None.
* Breaking change: torch_device removed from __call__
`pipeline.to()` now has PyTorch semantics.
* Use kwargs and deprecation notice
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Apply torch_device compatibility to all pipelines.
* style
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: anton-l <anton@huggingface.co>
* add SafetyChecker
* better name, fix checker
* add checker in main init
* remove from main init
* update logic to detect pipeline module
* style
* handle all safety logic in safety checker
* draw text
* can't draw
* small fixes
* treat special care as nsfw
* remove commented lines
* update safety checker
* Allow passing non-default modules to pipeline.
Override modules are recognized and replaced in the pipeline. However,
no check is performed about mismatched classes yet. This is because the
override module is already instantiated and we have no library or class
name to compare against.
* up
* add test
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add stable diffusion pipeline
* get rid of multiple if/else
* batch_size is unused
* add type hints
* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py
* fix some bugs
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* karras + VE, not flexible yet
* Fix inputs incompatibility with the original unet
* Roll back sigma scaling
* Apply suggestions from code review
* Old comment
* Fix doc
* Extented the ability of ddpm scheduler
to utilize model that also predict the variance.
* Update src/diffusers/schedulers/scheduling_ddpm.py
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* [Vae and AutoencoderKL clean]
* save intermediate finished work
* more progress
* more progress
* finish modeling code
* save intermediate
* finish
* Correct tests
Hey, I really liked the project and was reading through the Readme.md file when I came across some spelling and grammatical errors that you might have missed while editing the documentation. It would be really a great opportunity for me if I could contribute to this project. Thank you.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Update README.md
line 23, 24 and 25: Remove "that" because "that" is unnecessary in these three sentences.
line 33: Rewrite this sentence and make it more straightforward.
line 34: This first sentence is incomplete.
line 117: “focusses" -> "focuses"
line 118: "continuous" -> "continuous"
line 119: "consise" -> "concise"
* Update README.md
In https://github.com/huggingface/diffusers/issues/124 I incorrectly suggested that the image set creation process was undocumented. In reality, I just hadn't located it. @patrickvonplaten did so for me.
This PR places a hotlink so that people like me can be shoehorned over where they needed to be.
* work in progress, fixing tests for numpy and make deterministic
* make tests pass via pytorch
* make pytorch == numpy test cleaner
* change default tensor format pndm --> pt
* up
* change model name
* renaming
* more changes
* up
* up
* up
* save checkpoint
* finish api / naming
* finish config renaming
* rename all weights
* finish really
* clean ddpm api to match ddim
* correct ve sde class
* update pipeline API for ve sde
* make style
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* improve comments for sde_ve scheduler, init tests
* more comments, tweaking pipelines
* timesteps --> num_training_timesteps, some comments
* merge cpu test, add m1 data
* fix scheduler tests with num_train_timesteps
* make np compatible, add tests for sde ve
* minor default variable fixes
* make style and fix-copies
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
The docs will be viewable at [http://localhost:3000](http://localhost:3000). You can also preview the docs once you have opened a PR. You will see a bot add a comment to a link where the documentation with your changes lives.
### Translating the Diffusers documentation into your language
As part of our mission to democratize machine learning, we'd love to make the Diffusers library available in many more languages! Follow the steps below if you want to help translate the documentation into your language 🙏.
**🗞️ Open an issue**
To get started, navigate to the [Issues](https://github.com/huggingface/diffusers/issues) page of this repo and check if anyone else has opened an issue for your language. If not, open a new issue by selecting the "Translation template" from the "New issue" button.
Once an issue exists, post a comment to indicate which chapters you'd like to work on, and we'll add your name to the list.
**🍴 Fork the repository**
First, you'll need to [fork the Diffusers repo](https://docs.github.com/en/get-started/quickstart/fork-a-repo). You can do this by clicking on the **Fork** button on the top-right corner of this repo's page.
Once you've forked the repo, you'll want to get the files on your local machine for editing. You can do that by cloning the fork with Git as follows:
**📋 Copy-paste the English version with a new language code**
The documentation files are in one leading directory:
- [`docs/source`](https://github.com/huggingface/diffusers/tree/main/docs/source): All the documentation materials are organized here by language.
You'll only need to copy the files in the [`docs/source/en`](https://github.com/huggingface/diffusers/tree/main/docs/source/en) directory, so first navigate to your fork of the repo and run the following:
```bash
cd ~/path/to/diffusers/docs
cp -r source/en source/LANG-ID
```
Here, `LANG-ID` should be one of the ISO 639-1 or ISO 639-2 language codes -- see [here](https://www.loc.gov/standards/iso639-2/php/code_list.php) for a handy table.
**✍️ Start translating**
The fun part comes - translating the text!
The first thing we recommend is translating the part of the `_toctree.yml` file that corresponds to your doc chapter. This file is used to render the table of contents on the website.
> 🙋 If the `_toctree.yml` file doesn't yet exist for your language, you can create one by copy-pasting from the English version and deleting the sections unrelated to your chapter. Just make sure it exists in the `docs/source/LANG-ID/` directory!
The fields you should add are `local` (with the name of the file containing the translation; e.g. `autoclass_tutorial`), and `title` (with the title of the doc in your language; e.g. `Load pretrained instances with an AutoClass`) -- as a reference, here is the `_toctree.yml` for [English](https://github.com/huggingface/diffusers/blob/main/docs/source/en/_toctree.yml):
```yaml
- sections:
- local:pipeline_tutorial# Do not change this! Use the same name for your .md file
title:Pipelines for inference# Translate this!
...
title:Tutorials# Translate this!
```
Once you have translated the `_toctree.yml` file, you can start translating the [MDX](https://mdxjs.com/) files associated with your docs chapter.
> 🙋 If you'd like others to help you with the translation, you should [open an issue](https://github.com/huggingface/diffusers/issues) and tag @patrickvonplaten.
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>
## Intro
Stable Diffusion is a [Latent Diffusion model](https://github.com/CompVis/latent-diffusion) developed by researchers from the Machine Vision and Learning group at LMU Munich, *a.k.a* CompVis.
Model checkpoints were publicly released at the end of August 2022 by a collaboration of Stability AI, CompVis, and Runway with support from EleutherAI and LAION. For more information, you can check out [the official blog post](https://stability.ai/blog/stable-diffusion-public-release).
Since its public release the community has done an incredible job at working together to make the stable diffusion checkpoints **faster**, **more memory efficient**, and **more performant**.
🧨 Diffusers offers a simple API to run stable diffusion with all memory, computing, and quality improvements.
This notebook walks you through the improvements one-by-one so you can best leverage [`StableDiffusionPipeline`] for **inference**.
## Prompt Engineering 🎨
When running *Stable Diffusion* in inference, we usually want to generate a certain type, or style of image and then improve upon it. Improving upon a previously generated image means running inference over and over again with a different prompt and potentially a different seed until we are happy with our generation.
So to begin with, it is most important to speed up stable diffusion as much as possible to generate as many pictures as possible in a given amount of time.
This can be done by both improving the **computational efficiency** (speed) and the **memory efficiency** (GPU RAM).
Let's start by looking into computational efficiency first.
Throughout the notebook, we will focus on [runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5):
We aim at generating a beautiful photograph of an *old warrior chief* and will later try to find the best prompt to generate such a photograph. For now, let's keep the prompt simple:
``` python
prompt = "portrait photo of a old warrior chief"
```
To begin with, we should make sure we run inference on GPU, so let's move the pipeline to GPU, just like you would with any PyTorch module.
``` python
pipe = pipe.to("cuda")
```
To generate an image, you should use the [~`StableDiffusionPipeline.__call__`] method.
To make sure we can reproduce more or less the same image in every call, let's make use of the generator. See the documentation on reproducibility [here](./conceptual/reproducibility) for more information.
Cool, this now took roughly 30 seconds on a T4 GPU (you might see faster inference if your allocated GPU is better than a T4).
The default run we did above used full float32 precision and ran the default number of inference steps (50). The easiest speed-ups come from switching to float16 (or half) precision and simply running fewer inference steps. Let's load the model now in float16 instead.
Cool, this is almost three times as fast for arguably the same image quality.
We strongly suggest always running your pipelines in float16 as so far we have very rarely seen degradations in quality because of it.
Next, let's see if we need to use 50 inference steps or whether we could use significantly fewer. The number of inference steps is associated with the denoising scheduler we use. Choosing a more efficient scheduler could help us decrease the number of steps.
Let's have a look at all the schedulers the stable diffusion pipeline is compatible with.
🧨 Diffusers is constantly adding a bunch of novel schedulers/samplers that can be used with Stable Diffusion. For more information, we recommend taking a look at the official documentation [here](https://huggingface.co/docs/diffusers/main/en/api/schedulers/overview).
Alright, right now Stable Diffusion is using the `PNDMScheduler` which usually requires around 50 inference steps. However, other schedulers such as `DPMSolverMultistepScheduler` or `DPMSolverSinglestepScheduler` seem to get away with just 20 to 25 inference steps. Let's try them out.
You can set a new scheduler by making use of the [from_config](https://huggingface.co/docs/diffusers/main/en/api/configuration#diffusers.ConfigMixin.from_config) function.
The image now does look a little different, but it's arguably still of equally high quality. We now cut inference time to just 4 seconds though 😍.
## Memory Optimization
Less memory used in generation indirectly implies more speed, since we're often trying to maximize how many images we can generate per second. Usually, the more images per inference run, the more images per second too.
The easiest way to see how many images we can generate at once is to simply try it out, and see when we get a *"Out-of-memory (OOM)"* error.
We can run batched inference by simply passing a list of prompts and generators. Let's define a quick function that generates a batch for us.
``` python
def get_inputs(batch_size=1):
generator = [torch.Generator("cuda").manual_seed(i) for i in range(batch_size)]
Going over a batch_size of 4 will error out in this notebook (assuming we are running it on a T4 GPU). Also, we can see we only generate slightly more images per second (3.75s/image) compared to 4s/image previously.
However, the community has found some nice tricks to improve the memory constraints further. After stable diffusion was released, the community found improvements within days and shared them freely over GitHub - open-source at its finest! I believe the original idea came from [this](https://github.com/basujindal/stable-diffusion/pull/117) GitHub thread.
By far most of the memory is taken up by the cross-attention layers. Instead of running this operation in batch, one can run it sequentially to save a significant amount of memory.
It can easily be enabled by calling `enable_attention_slicing` as is documented [here](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/text2img#diffusers.StableDiffusionPipeline.enable_attention_slicing).
``` python
pipe.enable_attention_slicing()
```
Great, now that attention slicing is enabled, let's try to double the batch size again, going for `batch_size=8`.
Nice, it works. However, the speed gain is again not very big (it might however be much more significant on other GPUs).
We're at roughly 3.5 seconds per image 🔥 which is probably the fastest we can be with a simple T4 without sacrificing quality.
Next, let's look into how to improve the quality!
## Quality Improvements
Now that our image generation pipeline is blazing fast, let's try to get maximum image quality.
First of all, image quality is extremely subjective, so it's difficult to make general claims here.
The most obvious step to take to improve quality is to use *better checkpoints*. Since the release of Stable Diffusion, many improved versions have been released, which are summarized here:
- *Official Release - 22 Aug 2022*: [Stable-Diffusion 1.4](https://huggingface.co/CompVis/stable-diffusion-v1-4)
- *20 October 2022*: [Stable-Diffusion 1.5](https://huggingface.co/runwayml/stable-diffusion-v1-5)
- *24 Nov 2022*: [Stable-Diffusion 2.0](https://huggingface.co/stabilityai/stable-diffusion-2-0)
- *7 Dec 2022*: [Stable-Diffusion 2.1](https://huggingface.co/stabilityai/stable-diffusion-2-1)
Newer versions don't necessarily mean better image quality with the same parameters. People mentioned that *2.0* is slightly worse than *1.5* for certain prompts, but given the right prompt engineering *2.0* and *2.1* seem to be better.
Overall, we strongly recommend just trying the models out and reading up on advice online (e.g. it has been shown that using negative prompts is very important for 2.0 and 2.1 to get the highest possible quality. See for example [this nice blog post](https://minimaxir.com/2022/11/stable-diffusion-negative-prompt/).
Additionally, the community has started fine-tuning many of the above versions on certain styles with some of them having an extremely high quality and gaining a lot of traction.
We recommend having a look at all [diffusers checkpoints sorted by downloads and trying out the different checkpoints](https://huggingface.co/models?library=diffusers).
For the following, we will stick to v1.5 for simplicity.
Next, we can also try to optimize single components of the pipeline, e.g. switching out the latent decoder. For more details on how the whole Stable Diffusion pipeline works, please have a look at [this blog post](https://huggingface.co/blog/stable_diffusion).
Pretty impressive! We got some very high-quality image generations there. The 2nd image is my personal favorite, so I'll re-use this seed and see whether I can tweak the prompts slightly by using "oldest warrior", "old", "", and "young" instead of "old".
``` python
prompts = [
"portrait photo of the oldest warrior chief, tribal panther make up, blue on red, side profile, looking away, serious eyes 50mm portrait photography, hard rim lighting photography--beta --ar 2:3 --beta --upbeta",
"portrait photo of a old warrior chief, tribal panther make up, blue on red, side profile, looking away, serious eyes 50mm portrait photography, hard rim lighting photography--beta --ar 2:3 --beta --upbeta",
"portrait photo of a warrior chief, tribal panther make up, blue on red, side profile, looking away, serious eyes 50mm portrait photography, hard rim lighting photography--beta --ar 2:3 --beta --upbeta",
"portrait photo of a young warrior chief, tribal panther make up, blue on red, side profile, looking away, serious eyes 50mm portrait photography, hard rim lighting photography--beta --ar 2:3 --beta --upbeta",
]
generator = [torch.Generator("cuda").manual_seed(1) for _ in range(len(prompts))] # 1 because we want the 2nd image
The first picture looks nice! The eye movement slightly changed and looks nice. This finished up our 101-guide on how to use Stable Diffusion 🤗.
For more information on optimization or other guides, I recommend taking a look at the following:
- [Blog post about Stable Diffusion](https://huggingface.co/blog/stable_diffusion): In-detail blog post explaining Stable Diffusion.
- [FlashAttention](https://huggingface.co/docs/diffusers/optimization/xformers): XFormers flash attention can optimize your model even further with more speed and memory improvements.
- [Dreambooth](https://huggingface.co/docs/diffusers/training/dreambooth) - Quickly customize the model by fine-tuning it.
- [General info on Stable Diffusion](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/overview) - Info on other tasks that are powered by Stable Diffusion.
🤗 Diffusers는 사전학습된 비전 및 오디오 확산 모델을 제공하고, 추론 및 학습을 위한 모듈식 도구 상자 역할을 합니다.
보다 정확하게, 🤗 Diffusers는 다음을 제공합니다:
- 단 몇 줄의 코드로 추론을 실행할 수 있는 최신 확산 파이프라인을 제공합니다. ([**Using Diffusers**](./using-diffusers/conditional_image_generation)를 살펴보세요) 지원되는 모든 파이프라인과 해당 논문에 대한 개요를 보려면 [**Pipelines**](#pipelines)을 살펴보세요.
- 추론에서 속도 vs 품질의 절충을 위해 상호교환적으로 사용할 수 있는 다양한 노이즈 스케줄러를 제공합니다. 자세한 내용은 [**Schedulers**](./api/schedulers/overview)를 참고하세요.
- UNet과 같은 여러 유형의 모델을 end-to-end 확산 시스템의 구성 요소로 사용할 수 있습니다. 자세한 내용은 [**Models**](./api/models)을 참고하세요.
- 가장 인기있는 확산 모델 테스크를 학습하는 방법을 보여주는 예제들을 제공합니다. 자세한 내용은 [**Training**](./training/overview)를 참고하세요.
## 🧨 Diffusers 파이프라인
다음 표에는 공시적으로 지원되는 모든 파이프라인, 관련 논문, 직접 사용해 볼 수 있는 Colab 노트북(사용 가능한 경우)이 요약되어 있습니다.
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Text-to-Image Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Image Variations Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Dual Image and Text Guided Generation |
| [vq_diffusion](./api/pipelines/vq_diffusion) | [Vector Quantized Diffusion Model for Text-to-Image Synthesis](https://arxiv.org/abs/2111.14822) | Text-to-Image Generation |
**참고**: 파이프라인은 해당 문서에 설명된 대로 확산 시스템을 사용한 방법에 대한 간단한 예입니다.
- [`accelerate`](https://huggingface.co/docs/accelerate/index)은 추론 및 학습을 위한 모델 불러오기 속도를 높입니다.
- [`transformers`](https://huggingface.co/docs/transformers/index)는 [Stable Diffusion](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/overview)과 같이 가장 널리 사용되는 확산 모델을 실행하기 위해 필요합니다.
## DiffusionPipeline
[`DiffusionPipeline`]은 추론을 위해 사전학습된 확산 시스템을 사용하는 가장 쉬운 방법입니다. 다양한 양식의 많은 작업에 [`DiffusionPipeline`]을 바로 사용할 수 있습니다. 지원되는 작업은 아래의 표를 참고하세요:
모든 [Diffusers' checkpoint](https://huggingface.co/models?library=diffusers&sort=downloads)에 대해 [`DiffusionPipeline`]을 사용할 수 있습니다.
하지만, 이 가이드에서는 [Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion)을 사용하여 text-to-image를 하는데 [`DiffusionPipeline`]을 사용합니다.
[Stable Diffusion](https://huggingface.co/CompVis/stable-diffusion) 기반 모델을 실행하기 전에 [license](https://huggingface.co/spaces/CompVis/stable-diffusion-license)를 주의 깊게 읽으세요.
이는 모델의 향상된 이미지 생성 기능과 이것으로 생성될 수 있는 유해한 콘텐츠 때문입니다. 선택한 Stable Diffusion 모델(*예*: [`runwayml/stable-diffusion-v1-5`](https://huggingface.co/runwayml/stable-diffusion-v1-5))로 이동하여 라이센스를 읽으세요.
>>> image = generator("An image of a squirrel in Picasso style").images[0]
>>> image.save("image_of_squirrel_painting.png")
```
확산 시스템은 각각 장점이 있는 여러 다른 [schedulers](./api/schedulers/overview)와 함께 사용할 수 있습니다. 기본적으로 Stable Diffusion은 `PNDMScheduler`로 실행되지만 다른 스케줄러를 사용하는 방법은 매우 간단합니다. *예* [`EulerDiscreteScheduler`] 스케줄러를 사용하려는 경우, 다음과 같이 사용할 수 있습니다:
스케줄러 변경 방법에 대한 자세한 내용은 [Using Schedulers](./using-diffusers/schedulers) 가이드를 참고하세요.
[Stability AI's](https://stability.ai/)의 Stable Diffusion 모델은 인상적인 이미지 생성 모델이며 텍스트에서 이미지를 생성하는 것보다 훨씬 더 많은 작업을 수행할 수 있습니다. 우리는 Stable Diffusion만을 위한 전체 문서 페이지를 제공합니다 [link](./conceptual/stable_diffusion).
만약 더 적은 메모리, 더 높은 추론 속도, Mac과 같은 특정 하드웨어 또는 ONNX 런타임에서 실행되도록 Stable Diffusion을 최적화하는 방법을 알고 싶다면 최적화 페이지를 살펴보세요:
- [Optimized PyTorch on GPU](./optimization/fp16)
- [Mac OS with PyTorch](./optimization/mps)
- [ONNX](./optimization/onnx)
- [OpenVINO](./optimization/open_vino)
확산 모델을 미세조정하거나 학습시키려면, [**training section**](./training/overview)을 살펴보세요.
final_image=pipe(image=image,prompt="Black font, white background, vector",noise_level=40,callback=callback)
final_image.save("diffusers_library.jpg")
if__name__=="__main__":
main()
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.