GitHub - Maknee/Project1-CUDA-Flocking

University of Pennsylvania, CIS 565: GPU Programming and Architecture, Project 1 - Flocking

1000 boids and 10000 boids with coherence

Henry Zhu
- LinkedIn, personal website, twitter, etc.
Tested on: Windows 10 Home, Intel i7-4710HQ @ 2.50GHz 22GB, GTX 870M (Own computer)

Answer to Questions

For each implementation, how does changing the number of boids affect performance? Why do you think this is?

With more boids, each simulation step has to iterate through more boids and calculate more neighbors. This hurts performance most in the naive implementation, which iterates through all boids for each boid when checking the rules. The non-naive implementations are less affected, since boids are stored in grid cells and only nearby cells are searched.
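The difference between the two search strategies can be sketched on the CPU. This is a minimal illustration, not the project's kernels: `Vec3`, `Grid`, `buildGrid`, `naiveNeighborCount`, and `gridNeighborCount` are hypothetical names, and the grid is a `std::map` rather than the sorted index arrays a GPU version would use. It assumes the cell width is at least the search radius, so checking the 3x3x3 block of cells around a boid is sufficient.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <map>
#include <tuple>
#include <vector>

struct Vec3 { float x, y, z; };
using Cell = std::tuple<int, int, int>;                 // integer cell coordinates
using Grid = std::map<Cell, std::vector<std::size_t>>;  // cell -> boid indices

// Squared distance between two points.
float dist2(const Vec3& a, const Vec3& b) {
    float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Naive search: compare one boid against every other boid, O(N) per boid.
int naiveNeighborCount(const std::vector<Vec3>& pos, std::size_t self, float radius) {
    int count = 0;
    for (std::size_t j = 0; j < pos.size(); ++j) {
        if (j != self && dist2(pos[self], pos[j]) <= radius * radius) ++count;
    }
    return count;
}

// Map a position to the grid cell that contains it.
Cell cellOf(const Vec3& p, float cellWidth) {
    return Cell(static_cast<int>(std::floor(p.x / cellWidth)),
                static_cast<int>(std::floor(p.y / cellWidth)),
                static_cast<int>(std::floor(p.z / cellWidth)));
}

// Bucket every boid index by its cell.
Grid buildGrid(const std::vector<Vec3>& pos, float cellWidth) {
    Grid grid;
    for (std::size_t i = 0; i < pos.size(); ++i) {
        grid[cellOf(pos[i], cellWidth)].push_back(i);
    }
    return grid;
}

// Grid search: only boids in the 27 surrounding cells are distance-tested.
int gridNeighborCount(const std::vector<Vec3>& pos, const Grid& grid,
                      std::size_t self, float radius, float cellWidth) {
    assert(cellWidth >= radius);  // otherwise 3x3x3 cells could miss neighbors
    const Cell c = cellOf(pos[self], cellWidth);
    int count = 0;
    for (int dx = -1; dx <= 1; ++dx)
        for (int dy = -1; dy <= 1; ++dy)
            for (int dz = -1; dz <= 1; ++dz) {
                auto it = grid.find(Cell(std::get<0>(c) + dx,
                                         std::get<1>(c) + dy,
                                         std::get<2>(c) + dz));
                if (it == grid.end()) continue;  // empty cell
                for (std::size_t j : it->second) {
                    if (j != self && dist2(pos[self], pos[j]) <= radius * radius) ++count;
                }
            }
    return count;
}
```

Both functions return the same count for a given boid. The grid version simply skips the distance test for boids in far-away cells, which is why its cost grows more slowly as the boid count rises.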

For each implementation, how does changing the block count and block size affect performance? Why do you think this is?

Increasing the block count/block size affects performance by splitting the work across more threads on the GPU, so more kernel threads can run at once.
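The relationship between block size and block count can be made concrete with the usual ceiling-division launch calculation. This is a sketch under the assumption of one thread per boid. `blocksFor` and `idleThreads` are hypothetical helper names, not functions from this repository.

```cpp
#include <cassert>

// Smallest number of blocks such that blocks * threadsPerBlock >= numBoids.
int blocksFor(int numBoids, int threadsPerBlock) {
    assert(numBoids >= 0 && threadsPerBlock > 0);
    return (numBoids + threadsPerBlock - 1) / threadsPerBlock;
}

// Threads in the last block that have no boid to process and exit early.
int idleThreads(int numBoids, int threadsPerBlock) {
    return blocksFor(numBoids, threadsPerBlock) * threadsPerBlock - numBoids;
}
```

For 20000 boids, a block size of 128 gives 157 blocks with 96 idle threads, and a block size of 256 gives 79 blocks with 224 idle threads. The total number of launched threads is nearly the same in both cases. What changes is how those threads are grouped onto the multiprocessors.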

For the coherent uniform grid: did you experience any performance improvements with the more coherent uniform grid? Was this the outcome you expected? Why or why not?

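I did notice a performance boost, which I attribute to more cache hits on the GPU: in the coherent grid, the data for boids in the same cell sits next to each other in memory.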

Coherent FPS

Uniform FPS
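The reordering behind the coherent grid can be sketched as a sort followed by a gather. This is an illustrative CPU version, assuming each boid already has a precomputed cell index. `sortByCell` and `makeCoherent` are hypothetical names, and a GPU implementation would typically do the sort with a parallel key-value sort instead of `std::stable_sort`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

struct Vec3 { float x, y, z; };

// Return boid indices ordered by the grid cell each boid falls in.
std::vector<std::size_t> sortByCell(const std::vector<int>& cellOfBoid) {
    std::vector<std::size_t> order(cellOfBoid.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::stable_sort(order.begin(), order.end(),
                     [&](std::size_t a, std::size_t b) {
                         return cellOfBoid[a] < cellOfBoid[b];
                     });
    return order;
}

// Gather positions into a new buffer so boids in the same cell are contiguous.
// The non-coherent grid instead reads pos[order[k]] on every access, which
// scatters reads across memory.
std::vector<Vec3> makeCoherent(const std::vector<Vec3>& pos,
                               const std::vector<std::size_t>& order) {
    assert(pos.size() == order.size());
    std::vector<Vec3> out(pos.size());
    for (std::size_t k = 0; k < order.size(); ++k) {
        out[k] = pos[order[k]];
    }
    return out;
}
```

After the gather, a thread walking through one cell reads consecutive elements of the coherent buffer, which is the access pattern that caches reward.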

Did changing cell width and checking 27 vs 8 neighboring cells affect performance? Why or why not? Be careful: it is insufficient (and possibly incorrect) to say that 27-cell is slower simply because there are more cells to check!

Changing the cell width and the number of neighboring cells checked affects performance because it changes how many boids the non-naive implementations have to iterate through for each boid. It does not affect the naive implementation, which iterates through all the boids anyway.
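A rough way to compare the two configurations is to look at the volume of space each one searches. The sketch below assumes the 8-cell search uses a cell width of twice the neighborhood radius (2 cells per axis) and the 27-cell search uses a cell width equal to the radius (3 cells per axis). It also assumes boids are spread uniformly. `searchedVolume` and `expectedCandidates` are hypothetical helpers for illustration only.

```cpp
#include <cassert>

// Volume of the cube covered by cellsPerAxis cells of width cellWidth per axis.
double searchedVolume(int cellsPerAxis, double cellWidth) {
    assert(cellsPerAxis > 0 && cellWidth > 0.0);
    double side = cellsPerAxis * cellWidth;
    return side * side * side;
}

// Expected number of boids distance-tested, for a uniform density
// given in boids per unit volume.
double expectedCandidates(int cellsPerAxis, double cellWidth, double density) {
    return searchedVolume(cellsPerAxis, cellWidth) * density;
}
```

With a radius of 1, the 8-cell search covers `searchedVolume(2, 2.0)`, which is 64 cubic units, while the 27-cell search covers `searchedVolume(3, 1.0)`, which is 27 cubic units. Under these assumptions the 27-cell search visits more cells but tests fewer boids, so the number of cells alone does not determine which configuration is slower.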

Performance Analysis (FPS)

Performance Analysis (From NSight)

2000 boids vs 20000 boids (visualized)

Naive

Uniform

Coherent

20000 boids (non visualized)

Uniform

Coherent

128 vs 256 blocks (20000 boids)

Uniform

Coherent
