seed fuzzing #184
Conversation
Adds 2 parameters for generating variations of a seed:
-z: optional 0-1 value to slerp from the -S noise to random noise (allows variations on an image)
-Z: optional target seed that the -S noise is slerped to (interpolate one image to another)
Based on https://github.com/bakkot/stable-diffusion/tree/noise
One thing I forgot to mention: you can use -n with this. Say you have a great prompt and seed and want some variations on it; you'd tack on -z with some fuzz value and then -n with some number of iterations.
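A hypothetical invocation (the prompt, seed, and fuzz amount below are placeholders, not taken from the PR):

```
# 9 fuzzed variations on a known good seed
dream> "a castle on a hill, matte painting" -S 12345 -z 0.2 -n9
```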
Another important note! If you really like one of the variations, you can re-generate it by using the variation's seed as the target fuzz seed.
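As a hedged illustration (prompt and seeds are made up, and this assumes the same -z value is reused):

```
# original run that produced the variation you liked
dream> "a castle on a hill, matte painting" -S 12345 -z 0.2
# regenerate that variation by targeting the seed recorded in its filename
dream> "a castle on a hill, matte painting" -S 12345 -Z 987654321 -z 0.2
```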
I like the idea, gj
Thank you for the PR. I concur that this is indeed a very powerful feature. I did some testing; here are my notes.
Seed Fuzzing
Seed Fuzz Target
This one worked a little differently than what I thought it would do. I expected -Z to create intermediary images between the source and the target, with -z being the strength of that. However, I noticed that -z actually creates images off the fuzzed seed rather than the original: lower values of -z create an image that is closer to the target than the source, and higher values are closer to the fuzzed version of the original seed. Not sure if that's intended, but I'd expect it to work the other way around, with -z being the controller for the strength of the interpolation rather than the fuzzing of the original seed? This is when provided in combination with -Z. Hope to get some clarification on that. Works great overall. Brilliant PR. Good job. I'll do a review on this once you get back to me.
I also thought this is how it would work. I expected the command below to transform the original image. Example:
I like this, and I know that @bakkot will be happy to see his vision from PR #81 get back into main. I'm going to wait until there's a response to @blessedcoolant's and @morganavr's comments, and then will do my review.
For the "target seed" thing, I think there's a more general feature which interpolates between arbitrary settings (for the features for which it makes sense to do so, which includes seeds). I'd like that feature to exist (and will implement it if I have time...), which would subsume the "target seed" part of this PR. So maybe just do the "fuzzing" part of this for now?
I second this. The fuzzing seems to work quite well. The interpolation is still not working as I'd expect it to. We can do just the fuzzing in this PR if you think you need more time to get the interpolation right. I'd also recommend we change the name and call it…
Agree, …
We probably don't want to tie one type of variant generation to -v, but rather use -v to pick a variant generator by name. Perhaps this one could be called…
If I'm understanding correctly, you were both expecting 10 output images (-n10), where image 1 is 100% 256~ and image 10 is 75% 578~ (and images 2-9 are in-betweens of that)? I can try to have it interpolate, but it would make assumptions (always starting at 0% and then going up to -z), and I think this would be better handled by a proper general-purpose interpolation of parameters. An alternate option is that it could do 10 variations on…
This unfortunately isn't possible with just one seed. When fuzzing, what is happening is that the noise generated from -S is being slerped toward random noise (or -Z-seeded noise). You can still regenerate fuzzed results, but you need to provide -S OriginalSeed -Z RandomSeed (the random seeds are the ones that get written to the filename and PNG info).
-z is the controller for the strength of the interpolation: 0.0 is the original seeded noise, 1.0 is the random seeded noise. But I have seen behaviour where, as soon as the slerp hits 0.5, it switches heavily toward the random-seeded noise's image. I don't know why, but it's similar to what occurs when using weighted prompts; it might be something specific to how SD works. I will try to save out just the raw noise and see if the sudden transition at 0.5 is noticeable there too.
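For readers following along, here is a minimal sketch of the slerp-between-seeded-noise idea described above. This is not the PR's actual code; the tensor shape and seeds are placeholders.

```python
import torch

def slerp(t, v0, v1, dot_threshold=0.9995):
    """Spherical interpolation between two noise tensors (illustrative sketch)."""
    v0_flat, v1_flat = v0.flatten(), v1.flatten()
    dot = torch.dot(v0_flat / v0_flat.norm(), v1_flat / v1_flat.norm())
    if dot.abs() > dot_threshold:
        # nearly parallel: fall back to plain linear interpolation
        return (1 - t) * v0 + t * v1
    theta_0 = torch.acos(dot)
    theta_t = theta_0 * t
    sin_theta_0 = torch.sin(theta_0)
    s0 = torch.sin(theta_0 - theta_t) / sin_theta_0
    s1 = torch.sin(theta_t) / sin_theta_0
    return s0 * v0 + s1 * v1

# noise from the -S seed, slerped toward noise from a random (or -Z) seed by -z
torch.manual_seed(12345)                 # original -S seed (placeholder)
noise_a = torch.randn(1, 4, 64, 64)
torch.manual_seed(987654321)             # random or -Z target seed (placeholder)
noise_b = torch.randn(1, 4, 64, 64)
x_T = slerp(0.2, noise_a, noise_b)       # -z 0.2
```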
…o seed-fuzzing
# Conflicts:
#   ldm/simplet2i.py
#   scripts/dream.py
Yes. I expected the interpolation to occur between -S and -Z over the number of steps given by -n.
Do you mean you'll use -z to control the interpolation factor rather than -n? You want to generate a constant 10 images? I think letting the user pick the -n value for the number of variants is better, because those images can be used for creating transition effects, etc.
I guess it's not a major deal because we can always get back the same result by supplying the same original seed and -z value.
I've noticed that too, but I presumed it might be this particular PR handling the transition incorrectly. I guess not. I'd like @lstein to weigh in on how to implement this. Functionally it's almost there; we just need to decide on the user experience and go for a final push. Thank you for the new pushes with the asserts. We can aim to release this over the next 24 hours.
No. With the prompt below, I expect 10 images with 75% variation towards the target seed. I want to use this feature mainly to create different types of variations of my favorite image:
When I specify… Although I can see a use case for the behavior @blessedcoolant mentioned:
It would create a series of images representing gradual interpolation between image A and image B. These images, I assume, could then be turned into a movie or GIF. Maybe if you specify an additional…
code gremlins...
Obviously this is a great feature for creating variations, but I would like to point out just how HUGE it is going to be for animating. I tested with txt2img, but I assume this will be implemented for img2img as well? I think that is where it will shine the most. For txt2img, considering that the seed affects the composition in its entirety, incrementing the -v value even by a small amount has a huge effect. I tested incrementing by 0.005 and it works quite well; however, that's a lot of frames to generate if the goal is a smooth animation that interpolates between 2 seeds. But for img2img, because the overall composition is taken from the init image, I suspect it will be able to take larger -v increments and still create a smoother animation than any other method previously possible. Combined with prompt weighting, which we already have, putting everything into a text file and loading it using --from_file makes this a very easy workflow (a rough sketch of that is below). This opens up a lot of ways of animating.
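A rough sketch of that workflow. The flag letters follow the -S/-V/-v naming discussed later in this thread, and the prompt, seeds, and step size are placeholders:

```python
# hypothetical helper: write a prompt file for --from_file that steps -v in
# small increments between two seeds, one line per animation frame
prompt = "a castle on a hill, matte painting"   # placeholder prompt
seed, target = 12345, 987654321                 # placeholder seeds
with open("animation_prompts.txt", "w") as f:
    for i in range(101):
        v = i * 0.005                           # 0.000 .. 0.500 in 0.005 steps
        f.write(f'"{prompt}" -S {seed} -V {target} -v {v:.3f}\n')
```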
You can use the "squash and merge" button on GitHub to get a clean, single-commit item in the history which links to the original PR. That's the usual flow for OSS projects, in my experience.
You need to pass small values like…
A full change in design != a variation. It's a totally different image, just produced by the same prompt.
I have a feeling, @blessedcoolant, that during a merge you resolved conflicts wrongly, and that's why your local code does not work like the others'.
I made a fresh pull, and I did not try values as low as…
To me this is working exactly as I would expect. It is not interpolating the 2 final images you would get from each seed; it is interpolating the generated noise from 2 different seeds, which means even a value of 0.1 has a high chance of altering the composition a lot, unless you get lucky or you manually picked 2 seeds whose final images were already close in composition. To me the real power of this feature will be for animation. Here is a quick example from interpolating between 2 handpicked seeds whose final images were close in composition: This is from -v0.0 to -v0.9, in 0.02 increments. This would break at even smaller increments if I had picked a second seed that produced a very different image from the first. This is why I think this feature will be pure magic if implemented for img2img as well, where it would be effectively possible to "morph" very smoothly between 2 variants based on the composition from the init image.
Okay, I implemented the behavior mentioned here (saving tensors): #184 (comment). Maybe @xraxra can add this functionality to this PR? Source file: …
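For context, a minimal sketch of the idea. This is not the linked source file's actual code; the path and tensor shape are invented:

```python
import torch

# save the initial noise tensor next to the output image so the exact
# variation can be reproduced later
x_T = torch.randn(1, 4, 64, 64)
torch.save(x_T, "outputs/000123.1234567890.noise.pt")

# later: reload it and feed it back in as the starting noise
x_T = torch.load("outputs/000123.1234567890.noise.pt")
```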
@morganavr as mentioned above, saving the whole tensor is overkill. You just need to save a list of the original seeds and their weights (in this case there will be exactly two: the input and the target).
@bakkot …
I will try to implement the thing I suggested later today. Short summary: the way this PR works is by constructing an "initial noise array" X_t from a weighted average of two arrays of noise generated from two different seeds. If you have both seeds and their weights, you can repeat that process to derive X_t and get the same result out.
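A sketch of the bookkeeping described here, assuming a plain weighted average of per-seed noise (the PR itself slerps, so this is illustrative only; names and shapes are made up):

```python
import torch

def noise_from_seed(seed, shape=(1, 4, 64, 64)):
    # deterministic noise for a given seed
    g = torch.Generator().manual_seed(seed)
    return torch.randn(shape, generator=g)

def combine(seeds_and_weights):
    """Rebuild the initial noise X_t from a list of (seed, weight) pairs."""
    return sum(w * noise_from_seed(s) for s, w in seeds_and_weights)

# e.g. 80% of the original seed's noise plus 20% of the target seed's noise
x_T = combine([(12345, 0.8), (987654321, 0.2)])
```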
Can't wait to have a look at your code!
See my comment #256 (comment). Keeping a clean history is sometimes harder than doing the actual programming work. It's not fun, and sometimes it's downright miserable. It is also the best anti-tech-debt activity (especially for the amount of effort involved) that I know of. lstein is absolutely correct that really understanding, and faithfully using,…
Yeah, in cases where there are multiple logic changes, you do have to do the cleanup first. But a lot of the time (like this PR) it's logically a single thing, and you can reasonably just squash instead of worrying about cleaning it up.
Guys, I was thinking about a feature. Is it possible to construct a "noise array" that changes only a specific part of the image? It sounds like inpainting, but maybe it's possible to implement using the algorithm from this PR. While using this PR's feature I noticed that very often I would love some part of the image to stay the same, and I could use a brush to paint over parts of the image... :) So yeah, basically inpainting, but with this "seed fuzzing" feature.
Keep in mind this feature is for txt2img; it doesn't take an image as input at all. With that said, with this feature (or rather a follow-on), one could in theory make it so that only part of the noise array changed. That would not guarantee that the output for the rest of the image stayed the same, but it would make it more likely.
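A rough sketch of what "changing only part of the noise array" could look like, assuming the noise lives in a 64x64 latent. Everything here is hypothetical, and as noted above the unchanged region of the output is only likely, not guaranteed, to stay similar:

```python
import torch

# base noise from the original seed
g = torch.Generator().manual_seed(12345)
base_noise = torch.randn(1, 4, 64, 64, generator=g)

# fresh noise from a second seed, used only inside the masked region
g2 = torch.Generator().manual_seed(987654321)
new_noise = torch.randn(1, 4, 64, 64, generator=g2)

# mask of the latent region to vary (1 = vary, 0 = keep)
mask = torch.zeros(1, 1, 64, 64)
mask[:, :, 16:48, 16:48] = 1.0

x_T = base_noise * (1 - mask) + new_noise * mask
```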
I know that this PR is for txt2img; that makes it all the more interesting. It would be inpainting from the trained model's data alone, because we did not provide an external input image. Any idea how to map the noise array to image pixels? I would be excited even if this "internal inpainting" for txt2img worked only at 512x512.
I don't have any idea myself; it would require knowing more about the first-stage encoder than I do to even tell if it's possible. Anyway, you should open a new issue for this so we can continue the discussion after merging this PR.
I should have an updated version of this PR within a couple of hours, so hold off on any further work/reviews for the moment.
Once you have a series of generated images, how easy is it to animate them?
This would make a great alternative to --grid.
Lincoln
OK, I've opened #277, which extends this PR to support reproducible outputs, variations-of-variations, and img2img.
I used ImageMagick because I have the binaries installed; it can very easily make a GIF from the images contained in a folder. I ran it with… I am pretty sure Pillow should be able to do it in Python as well.
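A minimal Pillow sketch, assuming the frames are already in a folder and sort correctly by filename (the folder path and timing are placeholders):

```python
from pathlib import Path
from PIL import Image

# collect the generated frames, sorted by filename, and write a GIF
frames = [Image.open(p) for p in sorted(Path("outputs/frames").glob("*.png"))]
frames[0].save(
    "interpolation.gif",
    save_all=True,
    append_images=frames[1:],
    duration=80,   # ms per frame
    loop=0,
)
```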
Yes, this is a problem: you can recreate them, but only with the original seed, the seed in the variant image's filename, and the -v amount used, like this:
Doing the above would re-create what you saw in the image with seed 777777777 in the filename, which was originally output from… But from there, there is not a way to make variations on that one, aside from adjusting -v to blend between the 2 seeds. I have some ideas for how to manage this (variants on variants...), but nothing I've started working on yet.
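A hypothetical command following that recipe. The prompt, original seed, and -v amount are placeholders; 777777777 is the seed from the variant's filename mentioned above, and the flag letters follow the -S/-V/-v naming used later in this thread:

```
# -v must be the same amount that produced the variant
dream> "your original prompt" -S 12345 -V 777777777 -v 0.2
```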
I have already implemented it, by saving tensor files next to the image files. The code is here: …
@blessedcoolant thanks for the awesome investigation; unfortunately I can't reproduce the issue here either. I tested all samplers to be sure. I know that the 2 ancestral samplers will not do "small changes" and exhibit behavior similar to how they rapidly change output between each step. But doing this test on all samplers shows that the images match, as far as setting -v to 1.0 versus using -V as the main -S seed. I would really like to have the interpolation thing, but I don't want to make it a special case for the -v/-V stuff; it seems like a more general-purpose interpolation solution would be better, so I'd prefer not to add it into this PR. For example, being able to just do…
@xraxra …
@jnpatrick99 from my testing locally, it works with every sampler except… That said, it does still give you something closer to the input than you'd get from a random seed. Here's an image and two variations generated with… But yeah, I think this will work less well with the original…
variation 1
I'm closing this PR; please use #277, it solves the multiple-seed history issue.















adds 2 parameters for generating variations of a seed:
-z optional 0-1 value to slerp from -S noise to random noise (allows variations on an image)
-Z optional target seed that -S noise is slerped to (interpolate one image to another)
based on https://github.com/bakkot/stable-diffusion/tree/noise
this is an updated version of #81
tried to keep it as simple as possible, this is a really powerful feature IMO
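A hypothetical pair of invocations showing the two new flags (prompt and seed values are placeholders):

```
# fuzz: 6 mild variations on seed 12345
dream> "a castle on a hill" -S 12345 -z 0.3 -n6
# target: slerp seed 12345's noise halfway toward seed 67890's noise
dream> "a castle on a hill" -S 12345 -Z 67890 -z 0.5
```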