
Feature/large grid #25

Closed
matthiaszilk wants to merge 5 commits into NanoComp:master from matthiaszilk:feature/large-grid

Conversation

@matthiaszilk

This is a fairly invasive change that enables very large grids with more than 2^31-1 grid points. The changes essentially consist of setting the type of meep::grid_volume::the_ntot and all related variables to ptrdiff_t (via a typedef) and tracking down all resulting errors, conversion warnings, and loop variables (for details see the commit message of 521cb13).
So far the changes have been tested only on x86-64 with gcc 4.8 and 4.9. I ran some simulations with very large grids (>2^31-1 grid points) for a dispersive gold grating (x and y periodic, z PMLs) and for a dipole emitter in a large box terminated with PMLs. The results obtained were consistent with smaller grid sizes and with unchanged versions of Meep. I can provide both the results and the ctl files if you are interested. If there are any further tests you would like me to perform, please let me know.
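For illustration, this is the overflow class being addressed: the total point count of a modest 3-D grid already exceeds a 32-bit int. A minimal sketch (grid_ntot is a hypothetical helper, not Meep's actual code):

```cpp
#include <cstddef>

// A 1300 x 1300 x 1300 grid has 2,197,000,000 points, which exceeds
// INT_MAX (2^31 - 1 = 2,147,483,647), so accumulating the total in a
// plain int overflows. Promoting one factor to ptrdiff_t before the
// multiply keeps the product in a 64-bit type on 64-bit systems.
std::ptrdiff_t grid_ntot(int nx, int ny, int nz) {
  return static_cast<std::ptrdiff_t>(nx) * ny * nz;
}
```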

Changed the default integer type for grid sizes from int to a 64-bit
signed integer type on 64-bit systems. This enables using large grids
with more than 2^31 grid points. For maintainability a new typedef
meep::integer is introduced in meep/meeptypes.hpp and is set to
ptrdiff_t. This ensures the proper sizes on both 32-bit and 64-bit
systems.

The changes were made according to the following procedure:
- the type of meep::grid_volume::the_ntot was updated to meep::integer
- everything was recompiled with -Wconversion under gcc 4.8.2 on a 64-bit
system
- all places with an implicit conversion from meep::integer to a smaller
integer type were updated iteratively until all warnings disappeared
- loop variables were updated from int to meep::integer when a
comparison against meep::integer was involved in the iteration
- all casts to int were checked and corrected where necessary
- additional casts were introduced to silence conversion warnings from
or to double
- print format specifiers were changed from %d to %ld, with additional
casts to long int to avoid warnings on 32-bit systems
- MPI communication wrappers were updated to accept meep::integer as the
message size. As MPI specifies only int-sized messages (and the messages
hopefully will not exceed this limit), internally the size is simply
downcast to int. However, checks of the message size were added before
downcasting, to abort when the message size does not allow a safe cast.
This at least ensures a proper error message when the supported message
size is exceeded.
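The typedef described above can be sketched as follows (the header name is taken from the commit message; this snippet is illustrative, not Meep's actual source):

```cpp
#include <cstddef>

namespace meep {
  // ptrdiff_t tracks the pointer size, so it is 32-bit on 32-bit
  // systems and 64-bit on 64-bit systems -- including LLP64 platforms,
  // where long int stays 32-bit but pointers are 64-bit.
  typedef std::ptrdiff_t integer;
}
```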
Collaborator

Why is this cast needed?

Author

Those casts were inserted to silence compiler warnings. Converting from a meep::integer to a double is a narrowing operation on a 64-bit system (the double mantissa has only 52 bits). To find all potential problems with the new 64-bit int type I had to use the gcc option -Wconversion, which warns when an implicit conversion may alter the value. The meep::integer -> double conversions gave hundreds of false positives, which made it hard to find the real problems, so I decided to insert static_cast<double> to make the conversion explicit. It turns off the compiler warning and documents that the conversion is actually intended. It looks a bit ugly, but it does not cause any overhead (the conversion has to happen anyway, whether implicit or explicit). So I would prefer to leave the casts.
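A minimal illustration of the pattern being discussed (the function and names are invented for illustration, not taken from Meep):

```cpp
#include <cstddef>

typedef std::ptrdiff_t integer;  // stand-in for meep::integer

// With -Wconversion, an implicit integer -> double conversion warns,
// because values above 2^53 cannot be represented exactly in a double
// (52-bit mantissa plus the implicit leading bit). The explicit cast
// silences the warning and documents the intent; it adds no runtime
// cost, since the conversion happens either way.
double inverse_count(integer ntot) {
  return 1.0 / static_cast<double>(ntot);
}
```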

@stevengj
Collaborator

stevengj commented Dec 2, 2015

A lot of the type conversions here seem superfluous, e.g. the calls to static_cast<double>.

@matthiaszilk
Author

The calls to static_cast<double> are required to suppress compiler warnings when -Wconversion is activated. They tell the compiler explicitly that the narrowing conversion from meep::integer (a 64-bit int on 64-bit systems) to double (which can hold only 52 bits in the mantissa) is intended. They should not impose any runtime overhead.

Some of the promotions from int to meep::integer might indeed be superfluous, but I decided to be conservative and promoted all variables that came into contact with a meep::integer. This might waste a little stack space in some places, but I don't consider that a severe problem. With some careful review one could identify the spurious promotions, but at the moment I don't see any benefit in doing so.

@stevengj
Collaborator

stevengj commented Dec 2, 2015

Fair enough.

src/mympi.cpp Outdated
Collaborator

inconsistent indentation here.

Collaborator

Since you have this check several times, it would be better to write a function:

static void check_size(meep::integer size) {
  if (size > INT32_MAX) abort("MPI message size exceeds maximum");
}

and then just call check_size.

Author

Sounds good. I'll fold the check into the cast operation by replacing static_cast<int>(size) with a separate function cast_message_size(size).
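A sketch of the cast_message_size idea (the real implementation lives in src/mympi.cpp and may differ in details such as the error-reporting mechanism):

```cpp
#include <climits>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

typedef std::ptrdiff_t integer;  // stand-in for meep::integer

// Verify that the message size fits MPI's int-typed count argument
// before downcasting, so oversized messages fail loudly instead of
// silently truncating.
static int cast_message_size(integer size) {
  if (size < 0 || size > INT_MAX) {
    std::fprintf(stderr, "MPI message size exceeds maximum\n");
    std::abort();
  }
  return static_cast<int>(size);
}
```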

src/mympi.cpp Outdated
Collaborator

Seems like there should be a check somewhere that sizeof(long int) == sizeof(meep::integer).

Author

As long as sizeof(long int) >= sizeof(meep::integer), everything is fine because the data of pt is copied. However, on systems with the LLP64 data model (are there any except MS Windows?) there will be trouble (maybe also in other places?).
One way to resolve the issue would be to use long long int instead of long int.
Another way would be to use meep::integer and define a suitable MPI data type, e.g. MPI_MEEP_INTEGER, that maps to MPI_INT, MPI_LONG, or MPI_LONG_LONG_INT, depending on the size of meep::integer. Then it would be possible to write

ivec max_to_all(const ivec &pt) {
  meep::integer in[5], out[5];
  for (int i = 0; i < 5; ++i) in[i] = out[i] = pt.in_direction(direction(i));
#ifdef HAVE_MPI
  MPI_Allreduce(in, out, 5, MPI_MEEP_INTEGER, MPI_MAX, mycomm);
#endif
  ivec ptout(pt.dim);
  for (int i = 0; i < 5; ++i) ptout.set_direction(direction(i), out[i]);
  return ptout;
}

I will try this and see how far I can get with AC_CHECK_SIZEOF and the preprocessor.
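A sketch of the preprocessor mapping this would require, assuming configure runs AC_CHECK_SIZEOF so that SIZEOF_INT, SIZEOF_LONG, and SIZEOF_PTRDIFF_T are defined (these macro names are autoconf's defaults; the actual configure.ac may differ):

```cpp
/* In a header included after <mpi.h> and the generated config.h.
   Sizes must be compared via configure-time macros, since sizeof
   cannot be evaluated in a preprocessor conditional. */
#if SIZEOF_PTRDIFF_T == SIZEOF_INT
#  define MPI_MEEP_INTEGER MPI_INT
#elif SIZEOF_PTRDIFF_T == SIZEOF_LONG
#  define MPI_MEEP_INTEGER MPI_LONG
#else
#  define MPI_MEEP_INTEGER MPI_LONG_LONG_INT
#endif
```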

- added a define MPI_MEEP_INTEGER that maps to the appropriate MPI type
- added MPI communication wrappers and auxiliary functions for long long int
  (if the type is supported by the compiler)
@matthiaszilk
Author

I added precautions for the case sizeof(long int) < sizeof(meep::integer). Please have a look.

@stevengj
Collaborator

stevengj commented Dec 4, 2015

Looking good. I assume you've tried make check? (It would be nice to set up Travis testing at some point.)

@stevengj
Collaborator

stevengj commented Dec 4, 2015

LIBS="$BLAS_LIBS $LAPACK_LIBS $BLAS_LIBS ... should be safe enough, if slightly ugly. But putting it in a separate PR would be good in any case.

@stevengj
Collaborator

stevengj commented Dec 4, 2015

Shouldn't the order be -lblas -llapack -lgsl? No, I think -lgsl -llapack -lblas is right: libraries should come in reverse dependency order (i.e. if A depends on B, then A should come before B on the link line, so that the linker can resolve A's undefined symbols from B).

Could it be that your LAPACK library is providing its own BLAS? (Sometimes liblapack bundles a BLAS, or vice versa.)

@matthiaszilk
Author

No, my LAPACK lib is not providing its own BLAS; I checked. I suspect it is some issue with libgsl that is specific to Ubuntu. I will try another OS or compile libgsl from source to see whether the bug persists.

@matthiaszilk
Author

Yes, I ran make check; I always do before pushing any changes. All tests pass (tested with MPI and MPB).
Today I also set up a 32-bit VM (Ubuntu 14.04.3, gcc 4.8.2) to make sure everything is fine there, too. Unfortunately, on the 32-bit system things do not look as good: two tests are failing, symmetry and three_d. I think this is not due to the changes of the current PR; on master the tests fail as well. Here are the logs for both branches:
test-suite_feature_large_grid.txt
test-suite_master.txt
The symmetry test passes when run serially, or when eps_compare in symmetry.cpp is increased from 1.0e-9 to 2.0e-9, so it might be just an accuracy issue. The three_d test also fails when run serially, and slightly increasing the tolerance does not help either. Do you have any suggestions on how to proceed?

@matthiaszilk
Author

Reducing the optimization level from -O3 to -O2 solved the problem; now all tests pass under 32-bit as well.
After playing with some optimization options, I'm quite sure that -ftree-vectorize causes the tests to fail.

@stevengj
Collaborator

Was there a measurable performance penalty to turning off auto-vectorization? (Probably this flag only helps in step_generic.cpp and step_generic_stride1.cpp, so we could potentially use it only there.) Perhaps this is a gcc bug? What gcc version do you have?

@matthiaszilk
Author

I tested mainly with gcc 4.8 and 4.9, and to a minor extent with gcc 5; all versions show the same behavior. The test failures occur when SSE2 or AVX is activated, either by using -msse2, -mavx, and so on, or by setting an appropriate architecture via the -march option. I tested several configurations of the -march flag and can provide the results if you are interested. I did only a little benchmarking based on the test suite (total runtime, time-stepping time in near2far). On the virtual machine there is no performance penalty when vectorization is turned off; maybe the speed is limited by the memory bandwidth. However, some testing on a real machine should be performed to be really sure. Unfortunately, I can do "real" benchmarks only on 64-bit machines at the moment.

This was referenced Feb 2, 2018
@stevengj
Collaborator

stevengj commented Feb 2, 2018

#193 is a more minimal version of this change that hopefully addresses the needs for large grids.

@stevengj stevengj closed this Feb 2, 2018