Skip to content

Comments

Fix a memory allocation error on FreeBSD 10.3R#710

Closed
mguegan wants to merge 1 commit intoprometheus:masterfrom
mguegan:patch-1
Closed

Fix a memory allocation error on FreeBSD 10.3R#710
mguegan wants to merge 1 commit intoprometheus:masterfrom
mguegan:patch-1

Conversation

@mguegan
Copy link

@mguegan mguegan commented Oct 22, 2017

Under some condition[1], it is possible to hit a memory allocation failure when retrieving the buffer_bytes via sysctl().
Adding the dataType seems to fix this issue.

[1] happened on a heavy memory loaded FreeBSD 10.3R

On some condition, it is possible to hit a memory allocation failure when retrieving the `buffer_bytes` via `sysctl()`.
Adding the `dataType` seems to fix this issue.
@SuperQ
Copy link
Member

SuperQ commented Oct 22, 2017

I don't really have any systems to verify that this is compatible with other versions / architectures. But I guess I can take your word for it. :-)

@mguegan
Copy link
Author

mguegan commented Oct 22, 2017

Well.. I've only tested this patch in production on 10.3R/amd64 version and I can confirm I don't see memory alloc errors anymore. The value returned by the system seems to be a C long data type, so I think using bsdSysctlTypeUint64 is perhaps not the best, but honestly, I don't know.

@derekmarcotte
Copy link
Contributor

derekmarcotte commented Oct 23, 2017

So the issue is of course in the type conversion. I can confirm it is defined as a long data-type. I put together this patch:

derekmarcotte@38d276f

However, the SysctlRaw call is only returning 4 bytes, when C.long is for sure 8 bytes on clang/amd64, so it hits the error check. I feel like sysctl is being "smart" about the amount of data it is returning.

Can you please confirm the value of your vfs.bufspace on systems where you are having a problem?

Do you hit the error check on these systems?

I hit it when vfs.bufspace is 0 or 1261568, sysctl only returns 4 bytes to represent these.

@mguegan
Copy link
Author

mguegan commented Oct 23, 2017

Sure @derekmarcotte here is the error msg :
ERRO[5474] ERROR: meminfo collector failed after 0.000207s: couldn't get meminfo: cannot allocate memory source="node_exporter.go:95"

Collected vfs.bufspace value obtained by sysctl(8) : 2414772224 (which outside the range of 32bits value (4bytes) 2147483647)

EDIT: just saw you meant you want me to test your patch, is that correct ?

@derekmarcotte
Copy link
Contributor

Great - thanks, I'll put together another patch, and hopefully a unit test. Are you interesting in testing it when it's ready?

Forcing it to 64-bit is going to cause you grief on systems with values lower than 2^31, so I wouldn't suggest it:

https://github.com/golang/sys/blob/master/unix/syscall_bsd.go#L518

@mguegan
Copy link
Author

mguegan commented Oct 23, 2017

Sure I'm ok for testing another patch 👍

Just tested with your previous one :

ERRO[0127] ERROR: meminfo collector failed after 0.000103s: 
couldn't get meminfo: 
length of bytes received from sysctl (4) is greater than expected bytes (8)
source="collector.go:136"

the bufspace value when this error occurs :

vfs.bufspace: 1773993984

@mguegan
Copy link
Author

mguegan commented Oct 23, 2017

BTW, just reverted to uint64 and this is working with vfs.bufspace: 1738702848..
Don't know why yet.

@derekmarcotte
Copy link
Contributor

Sorry if I wasn't clear, this is expected. The next patch I post will support both.

This works okay on your machines with value greater than 2^31 though?

@mguegan
Copy link
Author

mguegan commented Oct 23, 2017

Unfortunately, I've to wait tomorrow pm for additional tests.

@derekmarcotte
Copy link
Contributor

Yup, here's the culprit:

https://github.com/freebsd/freebsd/blob/releng/10.3/sys/kern/vfs_bio.c#L338

So it'll return a long, if the data is greater than long. Forcing bsdSysctlTypeUint64 is valid on amd64, but not on i386.

Patch forthcoming. ;)

@derekmarcotte
Copy link
Contributor

This should do it:

master...derekmarcotte:dm-bufspace-long

@mguegan
Copy link
Author

mguegan commented Oct 24, 2017

Thanks @derekmarcotte ! I'll test that patch a couple of hours and tell you if anything goes wrong. Anyway, great work on that issue.

@mguegan
Copy link
Author

mguegan commented Oct 24, 2017

@derekmarcotte this working as expected. How do you want to proceed next ? closing this PR and using yours instead ?

@derekmarcotte
Copy link
Contributor

Works for me, if it works for you?

@mguegan
Copy link
Author

mguegan commented Oct 24, 2017

Closing this PR. See a better approach on #712

@mguegan mguegan closed this Oct 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants