Conversation

@yunsaki (Contributor) commented Aug 25, 2022

First of all I want to apologise for this pull request. I really like this project and it's currently the best way to run stable diffusion locally imho. But I wanted to change some things...

What the hell I did

Well, I basically rewrote the entire dream.py and added some new things.

Why I did it

My idea was to add the ability to change parameters over time based on cmd settings, without having to copy, edit and re-enter every command. I had issues working with the original code, so I decided to rewrite it. However, I do not think that my code is "better" or anything. I just did it so I could find out how everything works and structure everything according to my idea.

How it works

Every argument that I thought was worth modulating now takes a string that can be either <value> or <init_value>:<increment>. Values are interpreted as int by default and as float for cfg_scale and strength. Using <value> results in a static value, while <init_value>:<increment> starts at init_value, which is then incremented by increment every repetition. The number of repetitions is set via -r or --repeats. Supported values are: steps, seed, width, height, cfg_scale, strength. Additionally, I have added the -B or --feedback argument, which passes the first image of every repetition's results to the next repetition. This, however, supersedes the recent addition of the -v option, which I am sorry for.
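
Roughly, the expansion behaves like this sketch (simplified, with illustrative helper names; it is not the actual code in this PR):

```python
from typing import List, Union

def expand_modulated_arg(spec: str, repeats: int, cast=int) -> List[Union[int, float]]:
    """Expand "<value>" or "<init_value>:<increment>" into one value per image.

    Illustrative helper only; the real implementation in dream.py differs.
    """
    if ":" in spec:
        init_str, inc_str = spec.split(":", 1)
        init, inc = cast(init_str), cast(inc_str)
        # repeats=5 means 6 images in total: the initial one plus 5 repetitions
        return [init + i * inc for i in range(repeats + 1)]
    # A plain value stays constant for every repetition
    return [cast(spec)] * (repeats + 1)

print(expand_modulated_arg("10:1", 5))            # seeds:     [10, 11, 12, 13, 14, 15]
print(expand_modulated_arg("25:5", 5))            # steps:     [25, 30, 35, 40, 45, 50]
print(expand_modulated_arg("7:0.5", 5, float))    # cfg_scale: [7.0, 7.5, 8.0, 8.5, 9.0, 9.5]
```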

An example

a cyberpunk cityscape in the style of wadim kashin -S 10:1 -s 25:5 -C 7:0.5 -r 5 -B
The repeats are set to 5 (default 0), so 6 images will be generated. The seed starts at 10 and is incremented by 1 every repetition. The steps start at 25 and are incremented by 5. The cfg_scale has an initial value of 7 and increases by 0.5 every repetition. -B is enabled, so the 5 images following the first one get the most recently generated image as an init_img (since n is not set to something higher than 1). This results in the following log output:

"a cyberpunk cityscape in the style of wadim kashin" -s 25:5 -b 1 -W 512 -H 512 -C 7:0.5 -r 5 -B
# outputs/img-samples/000131.10.png: "a cyberpunk cityscape in the style of wadim kashin" -s 25 -b 1 -W 512 -H 512 -C 7.0 -S 10
# outputs/img-samples/000132.11.png: "a cyberpunk cityscape in the style of wadim kashin" -s 30 -b 1 -W 512 -H 512 -C 7.5 -I outputs/img-samples/000131.10.png -f 0.75 -S 11
# outputs/img-samples/000133.12.png: "a cyberpunk cityscape in the style of wadim kashin" -s 35 -b 1 -W 512 -H 512 -C 8.0 -I outputs/img-samples/000132.11.png -f 0.75 -S 12
# outputs/img-samples/000134.13.png: "a cyberpunk cityscape in the style of wadim kashin" -s 40 -b 1 -W 512 -H 512 -C 8.5 -I outputs/img-samples/000133.12.png -f 0.75 -S 13
# outputs/img-samples/000135.14.png: "a cyberpunk cityscape in the style of wadim kashin" -s 45 -b 1 -W 512 -H 512 -C 9.0 -I outputs/img-samples/000134.13.png -f 0.75 -S 14
# outputs/img-samples/000136.15.png: "a cyberpunk cityscape in the style of wadim kashin" -s 50 -b 1 -W 512 -H 512 -C 9.5 -I outputs/img-samples/000135.14.png -f 0.75 -S 15

And the following images:
[generated images 000131.10.png through 000136.15.png]

Granted, those aren't amazing, but I think with some experimentation you can do some nice things with those extra options.

What I have planned

I also want to add prompt modulation. Let's assume we have the prompt oil painting of a landscape and a list of things we want to add over time while increasing their weighting: "in spring", "in summer", "in autumn", "in winter". The base prompt and the first item from the list get a fixed weighting of, let's say, 50: oil painting of a landscape:50 in spring:50. In the next step summer is added: oil painting of a landscape:50 in spring:49 in summer:1, and so on. Giving a list of prompts as arguments could also be nice. For some modifications it might make more sense to adjust ldm/simplet2i.py directly, though.
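
A rough sketch of that idea (nothing here is implemented yet; the helper name and the single-step weight shift just mirror the example above):

```python
def modulate_prompt(base: str, additions: list, base_weight: int = 50) -> list:
    """Build a sequence of weighted prompts that gradually introduces each
    addition, shifting weight away from the previous one. Sketch only."""
    prompts = []
    for step, addition in enumerate(additions):
        if step == 0:
            prompts.append(f"{base}:{base_weight} {addition}:{base_weight}")
        else:
            previous = additions[step - 1]
            # Shift one unit of weight from the previous addition to the new one
            prompts.append(f"{base}:{base_weight} {previous}:{base_weight - 1} {addition}:1")
    return prompts

for p in modulate_prompt("oil painting of a landscape",
                         ["in spring", "in summer", "in autumn", "in winter"]):
    print(p)
# oil painting of a landscape:50 in spring:50
# oil painting of a landscape:50 in spring:49 in summer:1
# oil painting of a landscape:50 in summer:49 in autumn:1
# oil painting of a landscape:50 in autumn:49 in winter:1
```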

What I didn't do

Extended user/usage documentation.

Thanks for reading this abomination.

@yunsaki (Contributor, Author) commented Aug 25, 2022

Some of the changes I made are quite arbitrary, btw, and I don't intend to force them.

@tildebyte (Contributor) commented:

  1. Amazing ideas
  2. From a sheer engineering/architecture standpoint, this will probably be very difficult to rebase into this repo. Have you considered adding a completely separate script with a new name? Obviously, it's up to @lstein whether or not he wants an expanding collection of generation scripts, but we do already have an incoming 'dream_web.py' - maybe you could make a 'dream_variations.py' or something?

@yunsaki (Contributor, Author) commented Aug 25, 2022

@tildebyte

Thanks! I did consider making this a separate file; it actually was separate while I was working on it. If that is a better solution, we can go with that too!

@lstein (Collaborator) commented Aug 25, 2022

@yunsaki I really appreciate the vision and engineering that went into this work. As you probably can tell I am a beginning python scripter (wrote my first script 2 weeks ago) and I can learn a lot from the idioms you used. However, I'm in the process of refactoring dream.py and simplet2i to make the whole system more flexible and easier to maintain, and at this point it will be very difficult for me to merge your changes into the repo. Also, as @tildebyte mentioned, there is now a dream_web.py script, and your syntax for introducing variations would be great to have there too.

So how about this? For now I can put your PR into a public branch and point people at it from the README because I think it will be very popular. Then, after I finish refactoring I will go through your code carefully and figure out how we can separate the prompt morphing code from the command-line processing and web processing code. I think there should be a module that takes a text prompt containing your variation syntax and returns a list of prompts that can be passed to the generation routines. This will preserve the basic architecture and separate the fancy bits from the UI bits. It will also support the web server well.

I also agree that this work supersedes the more limited variant generation that was brought in by an earlier PR.

Let me know what you prefer.
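
The module boundary described above might look roughly like this (hypothetical file and function names, not anything that exists in the repo):

```python
# prompt_variations.py: hypothetical module separating the variation/morphing
# logic from the CLI and web UI code.
from dataclasses import dataclass
from typing import List

@dataclass
class GenerationRequest:
    """One plain, fully resolved request for the generation routines."""
    prompt: str
    seed: int
    steps: int
    cfg_scale: float

def _expand(spec: str, count: int, cast):
    init, _, inc = spec.partition(":")
    init, inc = cast(init), (cast(inc) if inc else 0)
    return [init + i * inc for i in range(count)]

def expand_variations(prompt: str, seed: str, steps: str, cfg_scale: str,
                      repeats: int) -> List[GenerationRequest]:
    """Turn one command's settings (using the <init>:<increment> syntax) into
    plain requests that dream.py and dream_web.py could both hand to simplet2i."""
    count = repeats + 1
    return [GenerationRequest(prompt, s, st, c)
            for s, st, c in zip(_expand(seed, count, int),
                                _expand(steps, count, int),
                                _expand(cfg_scale, count, float))]
```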

@bakkot (Contributor) commented Aug 25, 2022

Incrementing the seed will produce completely different results - unlike parameters like step count or cfg_scale, seed x and x + 1 are basically unrelated to each other. So I don't think it makes sense to try to increment seed the way you increment other parameters.

But, you can pick two seeds and then interpolate between the two of them by running the noise generation step (which is where the seed is used) and then interpolating between those two arrays. That's what #81 does.

So I think it might make sense to handle this morphing a different way: instead of specifying a base value, a step size, and a number of steps for each parameter (which is what you currently do), you could instead specify a base value, a target value, and a number of steps, and then interpolate between those values as appropriate for each parameter. For simple parameters like step count there's no meaningful difference between the two, but that design will allow you to interpolate between seeds (and prompts!) as well as simpler parameters.
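
A sketch of what that could look like (hypothetical names; the noise part interpolates between the noise tensors derived from two seeds, roughly in the spirit of #81, and has not been tested):

```python
import torch

def lerp_values(start: float, end: float, count: int) -> list:
    """Linearly interpolate a scalar parameter (steps, cfg_scale, ...)."""
    denom = max(count - 1, 1)
    return [start + (end - start) * i / denom for i in range(count)]

def slerp(t: float, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Spherical interpolation between two noise tensors of the same shape."""
    a_flat, b_flat = a.flatten(), b.flatten()
    omega = torch.acos(torch.clamp(
        torch.dot(a_flat / a_flat.norm(), b_flat / b_flat.norm()), -1.0, 1.0))
    so = torch.sin(omega)
    if so.abs() < 1e-6:            # nearly parallel: fall back to a plain lerp
        return (1.0 - t) * a + t * b
    return (torch.sin((1.0 - t) * omega) / so) * a + (torch.sin(t * omega) / so) * b

def interpolate_noise(seed_a: int, seed_b: int, count: int, shape=(1, 4, 64, 64)):
    """Yield noise tensors morphing from seed_a's noise to seed_b's noise."""
    noise_a = torch.randn(shape, generator=torch.Generator().manual_seed(seed_a))
    noise_b = torch.randn(shape, generator=torch.Generator().manual_seed(seed_b))
    denom = max(count - 1, 1)
    for i in range(count):
        yield slerp(i / denom, noise_a, noise_b)

print(lerp_values(7.0, 9.5, 6))    # [7.0, 7.5, 8.0, 8.5, 9.0, 9.5]
```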

@yunsaki (Contributor, Author) commented Aug 25, 2022

@lstein Thank you for the nice response! Great to hear that you can learn something from this. I'm not a python expert by any means though, so maybe just keep that in mind. :)

Putting this implementation into a public branch sounds good to me, feel free to do that! I will probably try to hack on your refactored version as well in the next few days, if I find the time to do so.

Great work, keep it up!

@bakkot I agree. I basically added the seed parameter to have an easy way to control/reproduce a changing seed. Honestly though, I would argue that you could keep my simpler implementation and add yours as well. Would it do any harm to do both things?

@lstein changed the base branch from main to yunsaki-morphing-dream August 26, 2022 03:31
@lstein marked this pull request as ready for review August 26, 2022 03:31
@lstein deleted the branch invoke-ai:yunsaki-morphing-dream August 26, 2022 03:33
@lstein closed this Aug 26, 2022
@lstein (Collaborator) commented Aug 26, 2022

@yunsaki I've never done a merge into a non-main branch before and I screwed it up. I'm trying to rectify it now.

@lstein reopened this Aug 26, 2022
@lstein merged commit cc0520a into invoke-ai:yunsaki-morphing-dream Aug 26, 2022
@TingTingin commented:

If steps are the only thing being interpolated, then each image doesn't need its own full run; it can instead be output as soon as its step count is reached, similar to what's asked in #99.

@TingTingin commented Aug 26, 2022

also a sort of prompt matrix like this

copied from https://github.com/hlky/stable-diffusion-webui
Prompt matrix
Separate multiple prompts using the | character, and the system will produce an image for every combination of them. For example, if you use a busy city street in a modern city|illustration|cinematic lighting prompt, there are four combinations possible (first part of prompt is always kept):

a busy city street in a modern city
a busy city street in a modern city, illustration
a busy city street in a modern city, cinematic lighting
a busy city street in a modern city, illustration, cinematic lighting

would be good to add too
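
A minimal sketch of that expansion (a hypothetical helper, not part of this PR), following the rule that the first part is always kept:

```python
from itertools import combinations

def prompt_matrix(prompt: str) -> list:
    """Expand 'base|mod1|mod2|...' into every combination of the optional parts."""
    base, *modifiers = [part.strip() for part in prompt.split("|")]
    prompts = []
    for r in range(len(modifiers) + 1):
        for combo in combinations(modifiers, r):
            prompts.append(", ".join([base, *combo]))
    return prompts

for p in prompt_matrix("a busy city street in a modern city|illustration|cinematic lighting"):
    print(p)
# a busy city street in a modern city
# a busy city street in a modern city, illustration
# a busy city street in a modern city, cinematic lighting
# a busy city street in a modern city, illustration, cinematic lighting
```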

@bakkot (Contributor) commented Aug 27, 2022

> If steps are the only thing being interpolated, then each image doesn't need its own full run; it can instead be output as soon as its step count is reached

I don't think that's true. Running with 50 steps vs running with 100 steps but stopping early after 50 produces different results.
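
The reason is that the noise schedule is stretched over the total step count, so step 50 of a 100-step run is still only partway through denoising. A self-contained sketch (using a Karras-style schedule purely for illustration; the samplers in this repo build their schedules differently, but the effect is the same):

```python
import torch

def karras_sigmas(n: int, sigma_min: float = 0.03, sigma_max: float = 14.6, rho: float = 7.0):
    """Noise levels spread over n sampling steps (Karras et al. schedule)."""
    ramp = torch.linspace(0, 1, n)
    min_inv, max_inv = sigma_min ** (1 / rho), sigma_max ** (1 / rho)
    return (max_inv + ramp * (min_inv - max_inv)) ** rho

print(karras_sigmas(50)[-1].item())    # ~0.03: a complete 50-step run ends almost noise-free
print(karras_sigmas(100)[49].item())   # ~1.3:  step 50 of a 100-step run is still quite noisy
```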

@yunsaki (Contributor, Author) commented Aug 27, 2022

> example, if you use a busy city street in a modern city|illustration|cinematic lighting prompt, there are four combinations possible

How would you handle permutations? Because it could get out of hand pretty quickly, if you use too many words. Also what if you want to use pipes in the prompt itself? (I've seen people do that)

And how would this work with the previous modifiers? Let's take the prompt a busy street in a modern city -r 9 -C 7:1. Should it generate 4 images for every repetition? This would mean (without a more low level implementation) that the seeds wouldn't stay the same unless you set your seed right in the beginning.

Yeah, I think that could be useful. But I would prefer using an additional argument for that, like -p/--permutations or -c/--combinations (not sure if -c is already taken). Might also do another complete rewrite to make it less messy, not sure.

@TingTingin commented Aug 27, 2022

> example, if you use a busy city street in a modern city|illustration|cinematic lighting prompt, there are four combinations possible
>
> How would you handle permutations? Because it could get out of hand pretty quickly, if you use too many words. Also what if you want to use pipes in the prompt itself? (I've seen people do that)
>
> And how would this work with the previous modifiers? Let's take the prompt a busy street in a modern city -r 9 -C 7:1. Should it generate 4 images for every repetition? This would mean (without a more low level implementation) that the seeds wouldn't stay the same unless you set your seed right in the beginning.
>
> Yeah, I think that could be useful. But I would prefer using an additional argument for that, like -p/--permutations or -c/--combinations (not sure if -c is already taken). Might also do another complete rewrite to make it less messy, not sure.

They probably shouldn't be called permutations and combinations, since laypeople might get confused by the difference. Also, I think showing a preview of exactly how many generations are going to happen would be a big help, i.e. something like:

a busy street in | a modern city | illustration | cinematic lighting -r 9 -C 7:1 -combinations
Generating 4 images for each repetition (9) Total : 36 images

a busy street in | a modern city | illustration | cinematic lighting -r 9 -C 7:1 -permutations
Generating 24 images for each repetition (9) Total : 216 images

Obviously if people add too many words it will get out of hand but at least this will warn them before generation starts
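
Something along these lines could print that preview (and ask for confirmation) before any sampling starts; a hypothetical helper, not part of the PR:

```python
def confirm_generation(images_per_repetition: int, repetitions: int) -> bool:
    """Show how many images a command will produce and let the user bail out."""
    total = images_per_repetition * repetitions
    print(f"Generating {images_per_repetition} images for each repetition "
          f"({repetitions}) Total : {total} images")
    return input("Proceed? [y/N] ").strip().lower() == "y"

# confirm_generation(4, 9) -> "Generating 4 images for each repetition (9) Total : 36 images"
```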

@TingTingin commented Aug 27, 2022

I also think another nice feature would be an iteration mode for -r, i.e. if you have busy street in a modern city -r 9 -C 7:1 -s 25:1, then instead of increasing both by one per repetition it would do so iteratively, i.e.

busy street in a modern city -r 9 -C 7:1 -s 25:1 -it
Generating 9 images for each repetition (9) Total : 81 images

Though at this point you probably want individual repetition settings, so -r could be a global setting and something like this, maybe?

busy street in a modern city -r 9 -C 7:1 -s 25:1:5 -it
Generating 5 images for each repetition (9) Total : 45 images

If there's no -r specified, generation can just take whatever the largest number is

busy street in a modern city -C 7:1:6 -s 25:1:3 -it
Generating 3 images for each repetition (6) Total : 18 images

a busy street in | a modern city | illustration | cinematic lighting -C 7:1:6 -s 25:1:3 -it -combinations
Generating 12 images for each repetition (6) Total : 72 images

a busy street in | a modern city | illustration | cinematic lighting -C 7:1:6 -s 25:1:3 -it -permutations
Generating 72 images for each repetition (6) Total : 432 images

hopefully my math is correct

@TingTingin commented Aug 27, 2022

> If steps are the only thing being interpolated, then each image doesn't need its own full run; it can instead be output as soon as its step count is reached

> I don't think that's true. Running with 50 steps vs running with 100 steps but stopping early after 50 produces different results.

It would still be good for showing in-progress images if you wanted to integrate this into another program.

@SMUsamaShah (Contributor) commented:

> If steps are the only thing being interpolated, then each image doesn't need its own full run; it can instead be output as soon as its step count is reached

> I don't think that's true. Running with 50 steps vs running with 100 steps but stopping early after 50 produces different results.

Is it even possible? Can we produce an image at each step? If you know it can be done, can you please point out which part of the code I should be looking at? I am new to ML stuff and terminology and have almost no idea what is going on. I recently found that k_euler_a never converges, and even at 10000 steps it will produce a different image. Now I want to produce an image at each step. Running a loop that increases the step count on each iteration, as proposed in this PR, is too much work for such a simple thing.

@bakkot (Contributor) commented Aug 28, 2022

@SMUsamaShah It's easy for the things built into this repo, but a little trickier for k_euler_a, which comes from the k_diffusion library. From looking at the code a little, I am guessing that by passing a callback=something parameter here (probably threaded through from here) you might be able to get a callback invoked at each step with this argument, in which x is the image data (which needs to be translated into something useful to be rendered, probably by calling _samples_to_images).

I haven't tried this but it's somewhere you can start looking.

EDIT: actually it looks like this PR is implementing that already (for the web ui); you might try that branch.
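
For anyone who wants to experiment, here is an untested sketch of what that callback might look like, assuming a k_diffusion-style sampler that passes its callback a dict with the step index and the current latents; decode_latents stands in for whatever turns samples into PIL images (e.g. something like _samples_to_images):

```python
from pathlib import Path

def make_step_callback(decode_latents, outdir="outputs/intermediates", every=1):
    """Return a callback that saves the partially denoised image at each step.

    decode_latents: a function mapping latent samples to a list of PIL images
    (a stand-in for whatever this repo uses internally).
    """
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)

    def callback(info):
        # k_diffusion samplers pass their callback a dict along the lines of
        # {'i': step, 'x': latents, 'sigma': ..., 'denoised': ...}
        step = info["i"]
        if step % every != 0:
            return
        for n, image in enumerate(decode_latents(info["denoised"])):
            image.save(out / f"step_{step:04d}_{n}.png")

    return callback

# Hypothetical usage, if the parameter were threaded through to the sampler:
# sampler.sample(..., callback=make_step_callback(t2i._samples_to_images))
```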

@yunsaki mentioned this pull request Aug 28, 2022