Skip to content

COMP: Fix uninitialized memory Valgrind defects#3933

Merged
dzenanz merged 6 commits intoInsightSoftwareConsortium:masterfrom
jhlegarreta:FixUninitializedMemoryValgrindDefects
Mar 21, 2023
Merged

COMP: Fix uninitialized memory Valgrind defects#3933
dzenanz merged 6 commits intoInsightSoftwareConsortium:masterfrom
jhlegarreta:FixUninitializedMemoryValgrindDefects

Conversation

@jhlegarreta
Copy link
Copy Markdown
Member

@jhlegarreta jhlegarreta commented Feb 22, 2023

  • COMP: Fix itkParallelSparseFieldLevelSetImageFilter uninitialized memory
  • COMP: Fix itk::Histogram uninitialized memory
  • STYLE: Remove an unnecessary static_cast from Histogram::PrintSelf
  • STYLE: Prefer in-class {} member initializers
  • STYLE: Make PrintSelf implementation style consistent
  • STYLE: Remove duplicate statement in itk::Histogram::Initialize

PR Checklist

@github-actions github-actions Bot added type:Compiler Compiler support or related warnings area:Segmentation Issues affecting the Segmentation module area:Numerics Issues affecting the Numerics module labels Feb 22, 2023
@jhlegarreta
Copy link
Copy Markdown
Member Author

Not sure the second commit actually fixes the relevant issues.

Comment thread Modules/Numerics/Statistics/include/itkHistogram.hxx
@N-Dekker
Copy link
Copy Markdown
Contributor

The first commit "COMP: Fix itkParallelSparseFieldLevelSetImageFilter uninitialized memory" certainly looks fine to me 👍

Apparently my previous bulk commits (PR #3851, #3913, #3917) don't address all unitialized data member problems. That is because they only address classes that have virtual member functions or a itkNewMacro, itkSimpleNewMacro, or itkCreateAnotherMacro macro call. ParallelSparseFieldCityBlockNeighborList appears to have neither of them!

@jhlegarreta jhlegarreta force-pushed the FixUninitializedMemoryValgrindDefects branch from 22954da to 1e56f7c Compare February 22, 2023 23:40
@issakomi
Copy link
Copy Markdown
Member

issakomi commented Feb 23, 2023

A size of the array m_TempMeasurementVector is set in Initialize function at the line 199 in itkHistogram.hxx, this line is the origin of the uninitialized value(s). Probably m_TempMeasurementVector is sometimes not used after, i guess, don't know exactly.

  this->m_TempMeasurementVector.SetSize(this->GetMeasurementVectorSize());

The {} in .h file seems to have no effect. Values could be initialized with e.g. (after the line mentioned above).

  for (size_t x = 0; x < m_TempMeasurementVector.size(); ++x)
  {
    m_TempMeasurementVector[x] = 0;
  }

It seems to fix the defect, but, honestly, i don't know is it a good idea or not. Just FYI. Not a suggestion.

BTW, another thing, s. Histogram.hxx
this->m_TempIndex.SetSize(this->GetMeasurementVectorSize());
is done two times in the same function, s. line 179 and 198.

@issakomi
Copy link
Copy Markdown
Member

Private variables m_Pad1 and m_Pad2 (char[128]) seem to be unused. Other {}-d variables are initialized in the constructor to different values.

@N-Dekker
Copy link
Copy Markdown
Contributor

for (size_t x = 0; x < m_TempMeasurementVector.size(); ++x)
{
m_TempMeasurementVector[x] = 0;
}

FWIW, m_TempMeasurementVector seems to be an itk::Array, so m_TempMeasurementVector.Fill(0) could also work. Anyway, good catch @issakomi !

@N-Dekker
Copy link
Copy Markdown
Contributor

N-Dekker commented Feb 23, 2023

Private variables m_Pad1 and m_Pad2 (char[128]) seem to be unused.

If they are indeed unnecessary, please remove those m_Pad... members!


Update: I guess now that those "pad" members are there for some clever optimization purposes. To align the other data members efficiently, to avoid "false sharing". Honestly I do not know, as it's not my code. But does not seem useful to me to print them anyway. @jhlegarreta Do you really need the bits of m_Pad1 and m_Pad2 to be printed? Looking at PR #3872 commit c47ed1c

os << indent << "m_Pad1: " << m_Pad1 << std::endl;
os << indent << "Size: " << m_Size << std::endl;
os << indent << "Radius: " << m_Radius << std::endl;
os << indent << "ArrayIndex: " << m_ArrayIndex << std::endl;
os << indent << "NeighborhoodOffset: " << m_NeighborhoodOffset << std::endl;
os << indent << "StrideTable: " << m_StrideTable << std::endl;
os << indent << "Pad2: " << m_Pad2 << std::endl;

@jhlegarreta
Copy link
Copy Markdown
Member Author

Thanks for investigating the valgrind defects with itk::Histogram @issakomi 💯. I will try to review your comments and see if I can come up with a patch set as time permits.

Private variables m_Pad1 and m_Pad2 (char[128]) seem to be unused.

Thanks. Will have a look at them and remove if appropriate.

Other {}-d variables are initialized in the constructor to different values.

Will see if I have the time to look if the initialization can be done in the header file.

@dzenanz
Copy link
Copy Markdown
Member

dzenanz commented Feb 23, 2023

to avoid "false sharing"

This is the more likely reason. Not printing the pad members sounds like the right thing. And then they might not need to be initialized?

I did only a glancing review initially. It is good that other people review as well.

@jhlegarreta
Copy link
Copy Markdown
Member Author

. @jhlegarreta Do you really need the bits of m_Pad1 and m_Pad2 to be printed? Looking at PR #3872 commit c47ed1c

I'd say that it is not a matter of preference: if the person that designed the class thought/required a given variable to be a member, then the ITK standard is to have it printed as any other member (unless static). Note that I did not read the body of the methods to determine in which ways the variable is used when addressing the defect/when printing the variable.

The standard could be improved if we come up with a rationale as to why a given variable should not be printed, and a way for automatically marking such variables for the purpose you describe Niels (a no-op macro if that makes sense/serves the purpose).

In this case, and until we propose something, I can comment both lines to serve as a reminder, or else I can remove them.

And then they might not need to be initialized?

I would keep the initialization to avoid issues. But as you wish.

@dzenanz
Copy link
Copy Markdown
Member

dzenanz commented Feb 23, 2023

Initializing them seems safer, even if not necessary.

Fix uninitialized memory warning in
`itk::ParallelSparseFieldLevelSetImageFilter`: add an in-class `{}`
default member initializer to its non-static `m_Pad1` and `m_Pad2` data
members.

Fixes:
```
UMC ==17034== Conditional jump or move depends on uninitialised value(s)
==17034==    at 0x483EF49: strlen (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==17034==    by 0x6561BCD: std::basic_ostream >& std::operator<<  >(std::basic_ostream >&, char const*) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.28)
==17034==    by 0x3B867E: itk::ParallelSparseFieldCityBlockNeighborList, itk::ZeroFluxNeumannBoundaryCondition, itk::Image > > >::Print(std::ostream&, itk::Indent) const (itkParallelSparseFieldLevelSetImageFilter.hxx:86)
==17034==    by 0x3B49D9: itk::ParallelSparseFieldLevelSetImageFilter, itk::Image >::PrintSelf(std::ostream&, itk::Indent) const (itkParallelSparseFieldLevelSetImageFilter.hxx:2516)
==17034==    by 0x5429E53: itk::LightObject::Print(std::ostream&, itk::Indent) const (itkLightObject.cxx:119)
==17034==    by 0x3B1D3F: itk::SmartPointer::Print(std::ostream&) const (itkSmartPointer.h:172)
==17034==    by 0x3AF11D: std::ostream& itk::operator<< (std::ostream&, itk::SmartPointer) (itkSmartPointer.h:297)
==17034==    by 0x3AD29E: itkParallelSparseFieldLevelSetImageFilterTest(int, char**) (itkParallelSparseFieldLevelSetImageFilterTest.cxx:318)
==17034==    by 0x211CA7: main (ITKLevelSetsTestDriver.cxx:272)
```

Raised for example at:
https://open.cdash.org/viewDynamicAnalysisFile.php?id=10071614

Related to commit 5e2c49f.

Uncovered by commit c47ed1c.
Fix uninitialized memory warning in `itk::Histogram`:
`m_TempMeasurementVector` is an `itk::Array` so rely on its prining
capabilities without using the `itk::NumericTraits<>::PrintType`
casting.

Fixes:
```
UMC ==17753== Conditional jump or move depends on uninitialised value(s)
==17753==    at 0x6ED4867: double_conversion::DoubleToStringConverter::ToShortestIeeeNumber(double, double_conversion::StringBuilder*, double_conversion::DoubleToStringConverter::DtoaMode) const (double-to-string.cc:172)
==17753==    by 0x55A1556: double_conversion::DoubleToStringConverter::ToShortest(double, double_conversion::StringBuilder*) const (double-to-string.h:216)
==17753==    by 0x55A0C60: (anonymous namespace)::ConvertToShortest(double_conversion::DoubleToStringConverter const&, double, double_conversion::StringBuilder&) (itkNumberToString.cxx:31)
==17753==    by 0x55A0DB5: std::__cxx11::basic_string, std::allocator > (anonymous namespace)::FloatingPointNumberToString(double) (itkNumberToString.cxx:55)
==17753==    by 0x55A0CD4: itk::NumberToString::operator()[abi:cxx11](double) const (itkNumberToString.cxx:71)
==17753==    by 0x55A0B3A: std::__cxx11::basic_string, std::allocator > itk::ConvertNumberToString(double) (itkNumberToString.h:86)
==17753==    by 0x55A0507: std::ostream& itk::operator<< (std::ostream&, itk::Array const&) (itkArrayOutputSpecialization.cxx:34)
==17753==    by 0x63FC43: itk::Statistics::Histogram::PrintSelf(std::ostream&, itk::Indent) const (itkHistogram.hxx:705)
==17753==    by 0x555EE53: itk::LightObject::Print(std::ostream&, itk::Indent) const (itkLightObject.cxx:119)
==17753==    by 0x66F568: itk::CompareHistogramImageToImageMetric, itk::Image >::PrintSelf(std::ostream&, itk::Indent) const (itkCompareHistogramImageToImageMetric.hxx:158)
==17753==    by 0x66DB44: itk::KullbackLeiblerCompareHistogramImageToImageMetric, itk::Image >::PrintSelf(std::ostream&, itk::Indent) const (itkKullbackLeiblerCompareHistogramImageToImageMetric.hxx:99)
==17753==    by 0x555EE53: itk::LightObject::Print(std::ostream&, itk::Indent) const (itkLightObject.cxx:119)
==17753==    by 0x663AF1: itkKullbackLeiblerCompareHistogramImageToImageMetricTest(int, char**) (itkKullbackLeiblerCompareHistogramImageToImageMetricTest.cxx:303)
==17753==    by 0x2CF347: main (ITKRegistrationCommonTestDriver.cxx:417)
```

and
```
UMC ==19839== Conditional jump or move depends on uninitialised value(s)
==19839==    at 0x6C8D867: double_conversion::DoubleToStringConverter::ToShortestIeeeNumber(double, double_conversion::StringBuilder*, double_conversion::DoubleToStringConverter::DtoaMode) const (double-to-string.cc:172)
==19839==    by 0x54CA58F: double_conversion::DoubleToStringConverter::ToShortestSingle(float, double_conversion::StringBuilder*) const (double-to-string.h:221)
==19839==    by 0x54C9C95: (anonymous namespace)::ConvertToShortest(double_conversion::DoubleToStringConverter const&, float, double_conversion::StringBuilder&) (itkNumberToString.cxx:40)
==19839==    by 0x54CA0DE: std::__cxx11::basic_string, std::allocator > (anonymous namespace)::FloatingPointNumberToString(float) (itkNumberToString.cxx:55)
==19839==    by 0x54C9D2A: itk::NumberToString::operator()[abi:cxx11](float) const (itkNumberToString.cxx:78)
==19839==    by 0x54C9BB8: std::__cxx11::basic_string, std::allocator > itk::ConvertNumberToString(float) (itkNumberToString.h:86)
==19839==    by 0x54C96DD: std::ostream& itk::operator<< (std::ostream&, itk::Array const&) (itkArrayOutputSpecialization.cxx:54)
==19839==    by 0x76AFD3: itk::Statistics::Histogram::PrintSelf(std::ostream&, itk::Indent) const (itkHistogram.hxx:705)
==19839==    by 0x5487E53: itk::LightObject::Print(std::ostream&, itk::Indent) const (itkLightObject.cxx:119)
==19839==    by 0x7257CF: itk::SmartPointer const>::Print(std::ostream&) const (itkSmartPointer.h:172)
==19839==    by 0x711956: std::ostream& itk::operator<<  const>(std::ostream&, itk::SmartPointer const>) (itkSmartPointer.h:297)
==19839==    by 0x711173: void PrintHistogramInfo const> >(itk::SmartPointer const>) (itkHistogramMatchingImageFilterTest.cxx:126)
==19839==    by 0x6F62DB: int itkHistogramMatchingImageFilterTest() (itkHistogramMatchingImageFilterTest.cxx:272)
==19839==    by 0x6F2B1D: itkHistogramMatchingImageFilterTest(int, char**) (itkHistogramMatchingImageFilterTest.cxx:412)
==19839==    by 0x4735D7: main (ITKImageIntensityTestDriver.cxx:532)
```

Raised for example at:
https://open.cdash.org/viewDynamicAnalysisFile.php?id=10071615
https://open.cdash.org/viewDynamicAnalysisFile.php?id=10071616

Uncovered by commit c47ed1c.
Remove an unnecessary `static_cast` from `itk::Histogram::PrintSelf`:
`m_TempMeasurementVector` is an `itk::Array` so rely on its prining
capabilities without using the `itk::NumericTraits<>::PrintType`
casting.
@jhlegarreta jhlegarreta force-pushed the FixUninitializedMemoryValgrindDefects branch 2 times, most recently from 9142ed4 to 3d08e78 Compare February 24, 2023 03:09
@jhlegarreta
Copy link
Copy Markdown
Member Author

A few notes:

So maybe the variables were actually intended to be used. So for now, until we come up with more complete idea, I'll have them printed.

@issakomi @N-Dekker @dzenanz thanks for all the pointers.

@issakomi
Copy link
Copy Markdown
Member

issakomi commented Feb 24, 2023

FYI, this padding is there since the first 'parallel' version of the filter (2003). S. here

  struct ThreadData
  {
    char pad1 [128];
...
    char pad2 [128];
  };
  
  /** An array storing the individual structures for each thread. */
  ThreadData *m_Data;

And here

Prefer using in-class `{}` member initializers in
`itk::ParallelSparseFieldLevelSetImageFilter` for its non-static data
members.

Related to commit 5e2c49f.
Make `PrintSelf`implementation style consistent following the ITK SWG
coding style guideline:
- Use the `os << indent << "{ivarName}: " << m_ivarName << std::endl`
  recipe: do not print the leading `m_` of the ivar.

Following commit c47ed1c.
@jhlegarreta jhlegarreta force-pushed the FixUninitializedMemoryValgrindDefects branch from 3d08e78 to 53d8dbe Compare February 25, 2023 00:29
RadiusType m_Radius;
std::vector<unsigned int> m_ArrayIndex;
std::vector<OffsetType> m_NeighborhoodOffset;
char m_Pad1[128]{};
Copy link
Copy Markdown
Contributor

@N-Dekker N-Dekker Feb 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhlegarreta @dzenanz Please wait, before merging. I'm still considering to propose adjusting the ITK Coding Style Guide, to exclude padding bytes from {} initialization.

Update: please check: InsightSoftwareConsortium/ITKSoftwareGuide#192

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could make it clearer that these are just padding bytes that do not need to be initialized by renaming m_Pad1 and m_Pad2 to something like m_UninitializedPaddingBytes1 and m_UninitializedPaddingBytes2...? We could also possibly make it clearer by declaring them as arrays of byte, instead of arrays of char. C++17 defines an std::byte type that we can easily borrow: https://en.cppreference.com/w/cpp/types/byte

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@N-Dekker I agree with our last comment here. the m_Pad1 is not really a state variable of the class. It is a solution to a problem that happens to use a char array to enforce performance gains, but does not represent the state of the class.

These pad variables should have good documentation that they are excluded from the "standard rules" applied to member variables that represent the state of the object.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @hjmjohnson By the way, it looks like C++17 offers some extra functionality to prevent false sharing. From https://en.cppreference.com/w/cpp/thread/hardware_destructive_interference_size:

Minimum offset between two objects to avoid false sharing.

struct keep_apart {
  alignas(std::hardware_destructive_interference_size) std::atomic<int> cat;
  alignas(std::hardware_destructive_interference_size) std::atomic<int> dog;
};

This would prevent false sharing between cat and dog, without having to declare those padding bytes explicitly. hardware_destructive_interference_size is typically 64 or 128, so similar to our m_Pad1 and m_Pad2 size.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When did we move to C++14? Maybe it is time to move to C++17?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was almost 2 years ago. I suggest making a forum topic about move to C++17, to see what others think about it. Perhaps describe implications on compiler support, and other similar considerations in the post.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR has other improvements which are not contentious. I will merge this. @N-Dekker can then make a follow-up PR which just deals with Padding bytes.

@dzenanz
Copy link
Copy Markdown
Member

dzenanz commented Mar 20, 2023

@jhlegarreta If this PR is ready for merging, please merge it. If not, please turn it into a draft.

@jhlegarreta
Copy link
Copy Markdown
Member Author

@dzenanz This is ready to be merged on my side; @N-Dekker was reluctant to initialize the m_Pad1 and m_Pad2 ivars, and @hjmjohnson also shared that. Niels assumed that those variables intended to avoid false sharing; we do not have any better hypothesis. I ignore if a steady decision has been made; if yes, it should be put in the ITK SW Guide. I am mostly away from keyboard these days, but I would very much like to have this merged as it will solve the Valgrind defects that are present since a month; changes can then be made in a separate PR (also to avoid needing to wait for the C++17 transition if that will affect how these variables are declared and printing the variables in the PrintSelf mehod), still ensuring that no new Valgrind defects are present.

@N-Dekker
Copy link
Copy Markdown
Contributor

N-Dekker commented Mar 21, 2023

I would very much like to have this merged as it will solve the Valgrind defects that are present since a month

Thanks for you analysis, @jhlegarreta Just to recall (for those who did not remember), the undefined behavior appears caused by printing those padding bytes (while they were uninitialized):

os << indent << "m_Pad1: " << m_Pad1 << std::endl;
os << indent << "Size: " << m_Size << std::endl;
os << indent << "Radius: " << m_Radius << std::endl;
os << indent << "ArrayIndex: " << m_ArrayIndex << std::endl;
os << indent << "NeighborhoodOffset: " << m_NeighborhoodOffset << std::endl;
os << indent << "StrideTable: " << m_StrideTable << std::endl;
os << indent << "Pad2: " << m_Pad2 << std::endl;

I'm still in favor of simply removing those print statements. Your PR appears to assume that m_Pad1 and m_Pad2 are zero-terminated character strings. But they are not. They are just raw bytes. Even after adding {} initializers, printing them does not yield any useful information. I mean, os << m_Pad1 always just prints an empty string, when m_Pad1 is {}-initialized.

@jhlegarreta
Copy link
Copy Markdown
Member Author

Your PR appears to assume that m_Pad1 and m_Pad2 are zero-terminated character strings

My PR does no assumption on their utility and follows the current ITK SW Guide guidelines to print all non-static ivars.

I mean, os << m_Pad1 always just prints an empty string,

Simply because they are not used elsewhere, and we are only guessing about their utility.

@N-Dekker
Copy link
Copy Markdown
Contributor

Just to explain what I mean by saying that PR (and specifically the print statements) appears to assume that m_Pad1 and m_Pad2 are always zero-terminated. If these arrays were filled with (only) non-zero values, printing them by os << m_Pad1 and os << m_Pad2 would still cause undefined behavior.

Example:

char arr[3] = { '1', '2', '3' };
std::cout << arr;  // Undefined behavior, arr is not zero-terminated!

@dzenanz dzenanz merged commit bbc691a into InsightSoftwareConsortium:master Mar 21, 2023
@dzenanz
Copy link
Copy Markdown
Member

dzenanz commented Mar 21, 2023

Padding bytes to be reassessed in a new PR. @N-Dekker if you do not plan to do that soon (maybe as part of C++17 PR which allows use of alignas), it would be good to create an issue so we don't forget about it. I think there are some other instances of padding bytes.

@jhlegarreta jhlegarreta deleted the FixUninitializedMemoryValgrindDefects branch March 21, 2023 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:Numerics Issues affecting the Numerics module area:Segmentation Issues affecting the Segmentation module type:Compiler Compiler support or related warnings

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants