logging: EFK must avoid NFS#2599
Conversation
@ewolinetz @richm PTAL @adellape or @ahardin-rh this is relatively high priority per the bug. It's relevant to origin and OSE 3.1+
heck /cc @thoraxe too.
Force-pushed from a459a2f to e49eab5.
Should probably remind users to stop their cluster first
I suppose so. I figured they were gonna lose any ephemeral data anyway...
oi... do we foresee having a flag to do this for users with the deployer?
I hadn't thought much about deployer parameters. It might be difficult to specify individual nodeSelectors for each instance, but it probably wouldn't be hard to patch in the local mounts and the privileged security context. BTW, we need to re-examine whether there's a way short of "privileged" that gets us past the SELinux problem with local mounts.
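For illustration, such a patch might look roughly like this. This is only a sketch — the node label, mount path, and container name here are hypothetical, not taken from the deployer:

```yaml
# Hypothetical fragment of an Elasticsearch deployment config:
# pin the pod to one node and mount a local disk from the host.
spec:
  template:
    spec:
      nodeSelector:
        logging-es-node: "1"          # assumed label; one distinct value per ES instance
      containers:
      - name: elasticsearch
        securityContext:
          privileged: true            # works around the SELinux denial on the host mount
        volumeMounts:
        - name: elasticsearch-storage
          mountPath: /elasticsearch/persistent
      volumes:
      - name: elasticsearch-storage
        hostPath:
          path: /usr/local/es-storage # admin-prepared local directory on that node
```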
do you have access to the privileged SCC here, or will hostmount-anyuid (which does not allow privileged) be enough?
@pweil- I tried hostmount-anyuid first and it did not have access due to SELinux context. I believe it's much the same problem we had with fluentd - openshift/origin-aggregated-logging#89 (comment)
It seems like less-than-privileged may be possible, but I'm not quite sure how and it seems like it would be a PITA for a user to set up. What do you think?
Hey, whaddya know... openshift/origin#8504
@pweil- I'm a little foggy on whether exactly the same fix will apply. The problem with fluentd was that it was trying to read and write in /var/log. Here we're trying to read and write in an admin-supplied storage volume; I suppose we could have them chcon the volume to whatever would be convenient? If so, what would that be - is there a label that will allow read/write for any context the pod may be running in?
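If manual relabeling turns out to be the answer, it would presumably look something like this on each node. The path is hypothetical, and `svirt_sandbox_file_t` is the type docker-era containers can generally read and write regardless of their MCS categories — verify for the target release:

```shell
# Label an admin-supplied local volume so any container context can use it.
# Run on each node that hosts an Elasticsearch instance.
chcon -R -t svirt_sandbox_file_t /usr/local/es-storage

# Verify the new label:
ls -dZ /usr/local/es-storage
```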
The kubelet (when a pod is using host namespaces) or docker should be performing a relabeling of the volume when it can. It uses the docker opts to pass in the selinux context that is being used. If that isn't working or this is a different use case then we can figure out what is different. cc @pmorie who is very familiar with the selinux code for volumes
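As a sketch of the relabeling path described here (field names per the Kubernetes pod spec; the level value is illustrative): when a pod carries an SELinux level, the kubelet/docker can relabel supported volume types to match it. Note that `hostPath` volumes are generally *not* relabeled automatically, which would be consistent with the denial in this thread.

```yaml
# Illustrative pod-level security context. Volumes of supported types
# get relabeled (docker ":Z"-style) to match this MCS level; hostPath
# volumes are left alone, so the admin must label them out of band.
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c2,c8"   # illustrative MCS level, not a recommendation
```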
What the AVC looks like, FYI:
```
type=AVC msg=audit(1470323991.042:27487): avc: denied { write } for pid=9883 comm="java" name="es-storage" dev="dm-0" ino=68862303 scontext=system_u:system_r:svirt_lxc_net_t:s0:c2,c8 tcontext=unconfined_u:object_r:usr_t:s0 tclass=dir
type=SYSCALL msg=audit(1470323991.042:27487): arch=c000003e syscall=83 success=no exit=-13 a0=7ff7d43d3780 a1=1ff a2=7ff7d43d3780 a3=7ff7c47bd728 items=0 ppid=15669 pid=9883 auid=4294967295 uid=1000 gid=0 euid=1000 suid=1000 fsuid=1000 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="java" exe="/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.101-3.b13.el7_2.x86_64/jre/bin/java" subj=system_u:system_r:svirt_lxc_net_t:s0:c2,c8 key=(null)
```
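For anyone reproducing this, the denial can be decoded on the node with the standard audit tools (assuming the `policycoreutils-python` package is installed for `audit2why`):

```shell
# Pull recent AVC denials from the audit log and ask SELinux why
# each one was denied (boolean, label mismatch, or missing policy).
ausearch -m avc -ts recent | audit2why
```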
Should we also warn customers to avoid gluster (which uses NFS on the backend)?
Think I've addressed existing concerns... any further?
what is an example complication?
The points that follow are the complications. Perhaps I should call them something else... considerations?
@sosiouxme just a few minor comments from me. Thanks!
@ahardin-rh think I addressed your comments now.
@sosiouxme Looks good! Thanks! Just a squash and we're good to go 🍻
It came to our attention via https://bugzilla.redhat.com/show_bug.cgi?id=1347666 and further research ( http://mail-archives.apache.org/mod_mbox/lucene-java-user/201210.mbox/%3C01a401cda09e$17b00160$47100420$@thetaphi.de%3E and https://lucene.apache.org/core/4_8_0/core/org/apache/lucene/store/NativeFSLockFactory.html ) that NFS is not suitable for Lucene storage. This documents how to use local storage, notes that NFS is not supported, and explains what to do if NFS is all you have.
Force-pushed from 510fef9 to 91ba11c.
ready, then.