Optimize GFPGAN and Improve User Experience #102
Additionally, I'd look into separating GFPGAN and Real-ESRGAN. GFPGAN is for blind face restoration and Real-ESRGAN is for upscaling. It would be better for users to have access to these individually, so they can combine GFPGAN with Real-ESRGAN on images with faces, and use Real-ESRGAN alone for images that just need upscaling.
The script now generates the low-res images first and then performs upscaling as a separate step. This is good because you are far less likely to hit memory crashes with a large batch and many iterations; this way the memory is not shared between SD and GFPGAN. Probably the best solution to this particular issue.
I've made some further improvements. Instead of running GFPGAN while SD is still processing, which leads to a ton of memory issues, the app now runs GFPGAN after SD's images have been generated. While SD is running, the app generates images and previews so the user can see their progress. When processing ends, the app rescales all of the generated images to an upscaled version using GFPGAN and Real-ESRGAN. This way, no matter how big the batch size or sample count gets, there will likely not be any memory or CUDA crashes. I've tested it out quite a bit and it seems to be working great. Let me know if you want any changes. @lstein It should be good to pull now if you wish.
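For illustration, here is a minimal sketch of the two-pass flow described above. The helper names are hypothetical stand-ins, not the actual ldm code:

```python
import torch

def generate_then_upscale(prompt, iterations, image_callback):
    """Sketch of the deferred pipeline: sample everything with SD first,
    then post-process, so SD and GFPGAN never hold VRAM at the same time."""
    results = []

    # Pass 1: Stable Diffusion only. Each image is handed to the callback
    # immediately, so the user gets live previews while the batch runs.
    for _ in range(iterations):
        image, seed = sample_stable_diffusion(prompt)  # hypothetical helper
        image_callback(image, seed)
        results.append((image, seed))

    # Release SD's cached tensors before the upscalers load.
    torch.cuda.empty_cache()

    # Pass 2: face restoration / upscaling on the finished images.
    for image, seed in results:
        restored = run_gfpgan_and_esrgan(image)        # hypothetical helper
        image_callback(restored, seed)
```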
Hi folks, I will inspect and merge this shortly. This project has grown quite a bit, so I'm going to start doing formal feature branches, bugfix branches, etc. I don't want users to get frustrated.
Thank you. I'll keep an eye out for it. No issues; it's the right thing to do if this repo is going to grow larger.
Unable to get this to run. Each iteration says "GFPGAN not initialized, it must be loaded via the --gfpgan argument" when I specify the -G argument, but that option has been removed.
Also, why iterate through the images twice when it's already happening after each image is generated on line 321? |
Review comment on ldm/simplet2i.py (outdated diff):

```python
image_callback(image, seed)
seed = self._new_seed()
# ...
if gfpgan_strength > 0:
```
This never gets run
Are you passing GFPGAN as a prompt argument?

`<prompt> -G 0.5`
I noticed another PR changed GFPGAN's default value to null instead of 0. Maybe that's causing an overlap issue? Wait, I'll do a fresh pull and see, because it's working flawlessly on my side.
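One concrete way a None default would bite, independent of this PR's actual code: in Python 3, ordering comparisons against None raise, so a strength guard written as a bare `> 0` check fails hard:

```python
gfpgan_strength = None  # what a null default would produce

# Python 3 raises: TypeError: '>' not supported between
# instances of 'NoneType' and 'int'
try:
    if gfpgan_strength > 0:
        print("would run GFPGAN")
except TypeError as err:
    print(f"guard blew up: {err}")

# A defensive guard treats None and 0 identically:
if gfpgan_strength is not None and gfpgan_strength > 0:
    print("run GFPGAN")
```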
Not sure. It's working fine on my end with the current live code; you no longer use the --gfpgan launch command. Does it say where the error originates? Also, the reason I'm iterating twice is to handle both single and multiple iterations. If I run GFPGAN for a single iteration in the same loop as SD, it's fine, but the cache does not get flushed when the user does multiple iterations, and generating a completely new tensor on every iteration for both GFPGAN and SD caused a memory crash on the first iteration. So instead of doing that, I chose to save the resulting data and upscale it after SD has done its job. I could loop just once, but I did it twice so that I can save images while SD is working, giving the user feedback, and still have upscaled images when SD is done. Basically a workaround that gets the memory optimization while retaining the functionality that exists right now.
It doesn't report any error, just the message that it can't be initialized, and then it fails; I get the non-GFPGAN'd image in my output.
Unsure. It's hard to tell without more info. I just made a clean pull from GitHub and merged it with mine; it seems to be working fine. Can you check whether the GFPGANer class init is correctly assigned to self.gfpgan? That's the only reason I can think of that would stop GFPGAN from starting. Also check whether gfpgan_strength is defaulted to None instead of 0 in dream.py.
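For anyone verifying that, this is roughly what a correct initialization looks like with the upstream gfpgan package; the model path and wrapper class here are placeholders, not this PR's exact code:

```python
from gfpgan import GFPGANer

class FaceRestorer:
    def __init__(self, model_path='gfpgan/GFPGANv1.3.pth'):  # placeholder path
        # The restorer must end up on self.gfpgan; if this assignment is
        # missing or misnamed, the -G path finds nothing to run.
        self.gfpgan = GFPGANer(
            model_path=model_path,
            upscale=1,              # scaling is Real-ESRGAN's job
            arch='clean',
            channel_multiplier=2,
            bg_upsampler=None,
        )

    def restore(self, image_bgr):
        # enhance() takes a BGR numpy array and returns
        # (cropped_faces, restored_faces, restored_img).
        _, _, restored = self.gfpgan.enhance(
            image_bgr,
            has_aligned=False,
            only_center_face=False,
            paste_back=True,
        )
        return restored
```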
OK, I created a new conda environment and activated it; the earlier failure was probably a me issue from testing the installation yesterday. My run time is now twice as long, and it reports that it processed 12 files instead of 6, which seems bad. Also, my VRAM usage is now worse after the first run because it seems to keep GFPGAN around until the GC gets to it, so I have to wait for it to clear. This seems like an extremely bad user experience. It also gives me a TON of errors if I didn't set up my GFPGAN correctly the first time (which I didn't, since I switched conda environments).
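If the lingering VRAM is just the dropped model waiting on garbage collection, releasing it explicitly is cheap. A generic pattern, not this PR's code:

```python
import gc
import torch

# Assuming `gfpgan` holds the loaded model: after its last use, drop the
# reference and flush the CUDA allocator cache so the VRAM is returned
# immediately instead of whenever the GC next runs.
gfpgan = None
gc.collect()
torch.cuda.empty_cache()
```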
I haven't even gotten that far. The problem I'm having with this PR is that the --gfpgan command-line switch is missing from the argv parsing list. Was it intended that GFPGAN be activated by specifying --gfpgan_strength when starting the script?
OK, I think this is still a work in progress, so I will convert it into a draft. I'm going to branch the current version (with Oceanswave's second PR) into a feature branch so that people can continue to work with it.
The GFPGAN implementation is yours; I did nothing there. If anything, I got rid of extra user warnings from the obsolete GFPGAN code. Even in your code, you were generating 12 images: 6 from SD and 6 upscaled. The only difference is that I am materializing them to disk so the user has a preview. I am not sure what issues you are facing with VRAM usage. For me the load has been far better, because you are no longer running GFPGAN and SD at the same time. In your implementation, 512x768, which easily renders on my system, kept crashing because GFPGAN started to hog the memory while the SD tensors were still in the loop. The GC behavior remains the same. To be very honest, I think this is a more ideal and user-friendly way to do it, because instead of going through the hoops of setting up a bunch of startup commands, a user can just trigger GFPGAN from the prompt, which is the most natural way to do it. If you hate seeing the log of images being saved, that's a one-line fix. Either way, I'll let @lstein decide which one he prefers. Personally I am staying with mine, because I truly believe it consumes far less memory and has a much nicer way of triggering upscaling through prompts.
You no longer need to use --gfpgan with this method. You can use the other commands to change the locations of the GFPGAN folders, but otherwise the entire process now works directly from the prompt, depending on the value of gfpgan_strength. It's intended. Why have two different startup modes for the script when the base script can handle the same thing through prompts? And I feel a user would much rather just type -G 0.7 and have the upscaling trigger than restart the script with and without --gfpgan. Not to mention, the initial boot time will be much faster because you're not preloading GFPGAN, and the eventual GFPGAN load will be faster too because the torch modules are already in play by then, thanks to Stable Diffusion.
Everyone, thank you for your efforts on this. Further work on GFPGAN should be done on a new branch named gfpgan-development. From a maintenance point of view, I'd like to see the majority of the GFPGAN code moved into its own module under ldm/dream/ so that it isn't cluttering the already busy dream.py and simplet2i.py modules.
But we talked about this: another loop has been introduced, and it's running _run_gfpgan for each image twice.

But it's not, since for every image it loads and unloads a new instance of GFPGAN.

I'm a big fan of failing fast: if a user started with --gfpgan, we know a priori that we should test for the GFPGAN dependencies instead of failing at runtime.
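Something along these lines at startup would give that fail-fast behavior; this is a sketch of the idea, with an assumed model-path argument:

```python
import os
import sys

def check_gfpgan_available(model_path):
    """Fail fast at launch instead of erroring on the first -G prompt."""
    try:
        import gfpgan  # noqa: F401 -- import check only
    except ImportError:
        sys.exit('GFPGAN was requested but the gfpgan package is not installed.')
    if not os.path.exists(model_path):
        sys.exit(f'GFPGAN model weights not found at {model_path}.')
```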
Slightly off topic, but is there a way to get both GPUs going during an iteration run on a multi-GPU system? That would be a major benefit for a few lucky people. I'm taking a break from code review and testing tonight, and will weigh in on the GFPGAN controversy tomorrow.

Lincoln

On Fri, Aug 26, 2022 at 1:37 PM Sean McLellan wrote:

> Agreed. The current implementation has all the images logged at break time, whether it's the regular SD version or the upscaled one, depending on what part of the process we're at. But if I say -n5 -G1 and I ^C after the 3rd, because I saw the first and got what I wanted in terms of content, I don't have the upscaled version at that point, so the operation isn't really 'complete'. On the main branch, if I run and see an image on the file system, it's done done. No takebacksies.
>
> There's virtually no difference for GFPGAN inference whether we load it initially or during the run. The only difference is the memory, and having GFPGAN in memory when it's not being used is not worth it, imo.
>
> Say I have a slow network share that my code is mounted on. For every image, it's making a call to see if GFPGAN exists, loading the model (it could have changed) across the network, getting it into VRAM, processing the image, then setting gfpgan to None, which lets the GC free it on the next cycle. Probably what I'm hitting on this PR is that things are getting loaded faster than the GC is able to get to them, and I eventually get an OOM.
So I made some changes on my end. Here's what I settled on.

**Patch 1**

Both GFPGAN and Real-ESRGAN are now independent of each other. Rather than triggering both every time, users can now control them independently depending on their prompt. To support this, I added two new prompt arguments (a sketch of the wiring follows below):

- Supplying the -U prompt argument triggers Real-ESRGAN image scaling (which previously ran together with GFPGAN). This will NOT perform face restoration; that is GFPGAN's job, so we should let it do that.
- Supplying the -G prompt argument triggers GFPGAN. This exclusively performs face restoration; it no longer performs scaling.
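A hedged sketch of how those two switches might be wired up; the exact flag shapes and defaults below are assumptions for illustration, not the PR's parser:

```python
import argparse

parser = argparse.ArgumentParser(description='per-prompt post-processing switches')
parser.add_argument('-U', '--upscale', type=float, default=None,
                    help='Real-ESRGAN scaling factor; does no face restoration')
parser.add_argument('-G', '--gfpgan_strength', type=float, default=0.0,
                    help='GFPGAN face restoration strength; does no scaling')

opts = parser.parse_args(['-U', '2', '-G', '0.7'])  # e.g. "<prompt> -U 2 -G 0.7"

# Each tool runs only when its own switch was supplied.
if opts.upscale:
    print(f'Real-ESRGAN upscaling x{opts.upscale}')
if opts.gfpgan_strength > 0:  # a 0.0 default avoids the None comparison bug
    print(f'GFPGAN at strength {opts.gfpgan_strength}')
```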
**Memory Issues**

After trying out a ton of options, I have to come back to the point that my implementation of the pipeline is the most ideal one for GFPGAN and Real-ESRGAN at this point in time. The main goal is to create beautiful images and hopefully cater to systems with lower VRAM so the maximum number of people can use it. With that in mind, I have to reaffirm my position that, to avoid memory issues, GFPGAN and Real-ESRGAN should run after SD has completed image generation. Performing them while SD is still in memory will inevitably crash the generation sooner or later. Since we last spoke I have generated over 1500 images with the new system, scaling all of them by 2x and performing face restoration, and I have not had a single OOM crash so far.

The big issue of a user canceling a generation and not having an upscaled result is no longer a problem. If a user cancels a generation, they will have all images generated up to that point. If they like any of those images, they can just take the seed and pass it back to the generator along with -U and -G to perform scaling and face restoration if they want. Problem solved.

This is what I have so far. I am trying to make the whole process even better; if I come up with any more ideas, I'll post them here. But if you agree with this, I think I'll make a new PR in a day or two with all the changes mentioned above and any others I might add.

@lstein I will have a look into the multi-GPU idea. Maybe there are ways to defer GFPGAN and Real-ESRGAN to a secondary GPU. Like you said, it'll be handy only for a lucky few, but there's no reason not to implement it if we can.
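On the secondary-GPU idea, the simplest version is just pinning the post-processing models to a second device so they never compete with SD for VRAM. A sketch, with hypothetical loader names:

```python
import torch

sd_device = torch.device('cuda:0')
post_device = torch.device('cuda:1' if torch.cuda.device_count() > 1
                           else 'cuda:0')  # fall back to a single GPU

sd_model = load_stable_diffusion().to(sd_device)  # hypothetical loader
gfpgan_model = load_gfpgan().to(post_device)      # hypothetical loader
```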
@lstein @Oceanswave I've pushed Patch 1 as detailed above, along with all the changes that came with it, and completed the documentation too. I generated 1500 images overnight, upscaled to 2x and face-restored; it did not crash at all on an RTX 3080 with 8GB. The base images are inferred and ready for viewing at the same time as they are now, and the upscaling is performed after image generation without affecting usage or viewing of the results in any way. Overall, at this time I strongly believe there is no better way of implementing this than what is currently done. It is almost foolproof while providing great results, ease of use, and no memory compromise. For detailed documentation on how GFPGAN and Real-ESRGAN work now, please read the documentation update I made in the README: https://github.com/blessedcoolant/stable-diffusion#gfpgan-and-real-esrgan-support Thank you.

NOTE: The output console shows two images being generated whenever an image is face-restored or upscaled. This is technically true, because two images are generated while only one is saved. I can suppress this if needed, but I let it stay so the user knows they have the option to use these features.
Thanks for all the work on this! I am testing now, after having taken a break for a family event.
No rush, mate. :) Take your time. Thank you for your time and effort too.
@blessedcoolant Is there any particular reason you changed the default image height to 768?
Oh crap. Those are just the defaults for my own system so I don't have to keep passing them. I forgot to reset it before I pushed. I've changed it back now. Sorry about that.
@blessedcoolant I've run into another problem. For benchmarking I'm running 5 iterations with the same starting seed each time. What I see is five different images being generated during the Stable Diffusion pass, but then during the GFPGAN pass only the second image is processed, five times! So I get five copies of the same image. The GFPGAN logging messages do refer to the correct seeds (the fixed starting seed followed by 4 randomly generated ones), so the logic is off somewhere.
@lstein Please check now. Fixed it.
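For illustration: that symptom (correct seeds in the logs, one repeated image on disk) is the classic result of tracking seeds and images in containers that drift out of sync. A hypothetical, self-contained sketch of the safer pattern, not the actual fix:

```python
# If images and their seeds live in separate lists, one missed append or a
# reused variable makes seed i point at the wrong image. Storing each pair
# as a single tuple rules that mismatch out.
results = []  # list of (image, seed) tuples, always appended together

for seed in (42, 7, 1234):
    image = f"image-for-seed-{seed}"   # stand-in for a generated image
    results.append((image, seed))

# The post-processing pass can then never mismatch them:
for image, seed in results:
    print(f"restoring {image} (seed {seed})")
```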
Hi folks. I've done a series of benchmarks on my 32G GPU; the summary is here: https://docs.google.com/spreadsheets/d/1YXRlzp9TT9gRQMvbzKXW0xRjUI4bllHgfVcil0kS4oI/edit?usp=sharing Take-home messages:
I think that users who want high-quality faces and upscaling would rather trade a 10% runtime increase for a 10% VRAM utilisation decrease. So my inclination is to bring in @blessedcoolant's implementation and acknowledge both contributors for the code in this work. I am so impressed with the improvements one can achieve with the combo of GFPGAN and Real-ESRGAN! I'll wait until the afternoon (~4h) for any comments on the benchmarking and conclusions before hitting the merge button.
Thank you so much for taking the time to do this testing. Looking forward to the merge. As for point 4 you mentioned, you will be able to repro that issue when you hit a resolution that sits on the very edge of what your system can handle; for a lot of people that is 512x512 on 6GB cards and 512x768 on 8GB cards. Either way, this PR solves it, even with img2img. I am currently looking at ways to decrease the memory footprint of SD at its very core, but trying to understand that code is a bit above my head. I'm learning, though; if I can figure something out, I'll make a new PR.
@blessedcoolant: I apologize profusely for coming into this so late, but this PR really needs a cleaner commit history before it goes in. OTOH, the final merge decision is, as always, up to @lstein.
I agree; it had a lot of changes. I should have kept it cleaner, but I was focused on the optimization. Sorry. If @lstein wishes, he can rebase this and pull it; if not, I can do a fresh PR with a cleaner commit history. Either works fine with me. You could also rebase it and make a PR that I can pull, if you have the time. I've just popped out and won't be back for a few hours, and I don't want to hold up @lstein if he wishes to merge before that.
I really did consider this, but I'd be starting from zero in understanding what's going on here, which is, eh, "sub-optimal" at best when doing aggressive rebasing 😆
Here's the list of changed files: ldm\dream\pngwriter.py
Yes, I'm definitely going to rebase.
@blessedcoolant I'm actually having a difficult time getting this to rebase cleanly. I wonder if you could rebase your commits and generate a clean PR against main? Thanks.
Yep. I'm closing this PR. I'll raise a new clean one.
Thanks so much.
Picked up in #162 |
This PR addresses the following issues.
NOTE: I am not on my primary system, so I manually updated the code on GitHub. Someone should double-check that the formatting is correct before this is merged.
Thank you.