chadh.github.io/atom.xml at master · chadh/chadh.github.io · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

  <title><![CDATA[My Blog]]></title>
  <link href="http://chadh.github.io/atom.xml" rel="self"/>
  <link href="http://chadh.github.io/"/>
  <updated>2013-12-18T13:26:40-05:00</updated>
  <id>http://chadh.github.io/</id>
  <author>
    <name><![CDATA[Chad Huneycutt]]></name>

  </author>
  <generator uri="http://octopress.org/">Octopress</generator>


  <entry>
    <title type="html"><![CDATA[r10k ftw?]]></title>
    <link href="http://chadh.github.io/blog/2013/12/17/r10k-ftw/"/>
    <updated>2013-12-17T19:49:00-05:00</updated>
    <id>http://chadh.github.io/blog/2013/12/17/r10k-ftw</id>
    <content type="html"><![CDATA[<p>Last time I talked about <a href="http://chadh.github.io/blog/2013/12/12/puppet-workflow-take-1/">how we have our puppet environments set up</a>, but I did not mention the tools we use to maintain it.  In this post, I&rsquo;ll cover how we currently do it, and the changes I am considering to make the process a little easier.</p>

<!-- more -->


<p>Our current workflow ideally goes something like this:</p>

<ol>
<li>Decide on a new feature</li>
<li>Create a new environment

<ol>
<li>Run the newenv.sh script, providing the base branch and the name of the new environment</li>
<li>Create feature branches for each repo that will be affected by the feature</li>
</ol>
</li>
<li>Make changes</li>
<li>Test changes by running puppet on a set of hosts in the new environment</li>
<li>Repeat the previous two steps, committing as appropriate, until done</li>
<li>Commit one final time</li>
<li>Merge changes into base branch</li>
<li>Push changes to central repo</li>
<li>Pull changes down to production environment</li>
<li>Carefully, painstakingly merge feature into other production branches</li>
</ol>


<p>That is pretty much the standard development workflow, so I won&rsquo;t say anything about the overall process.  The relevant steps for this discussion are Step 2 and the final step, which hide a lot of complexity.</p>

<p><strong>newenv.sh, puppet-librarian, and merging</strong><br/>
<code>newenv.sh</code> is a script we cooked up to automate the creation of an environment.  As I mentioned in the last post, our environments are composed of checkouts from several git repos, as well as downloads from the PuppetLabs Module Forge.  The script performs these steps:</p>

<ol>
<li>Clone the environment repo using the specified branch and name the checkout appropriately</li>
<li>Run librarian-puppet to fetch all the modules</li>
<li>Go behind lp and clean up by checking out the requested branches explicitly</li>
<li>Clone the hieradata for this base branch and graft it into the environment in the right place</li>
</ol>


<p>This brings us to <a href="https://github.com/rodjek/librarian-puppet">librarian-puppet</a>.  This is a tool that uses a description called a <code>Puppetfile</code> that describes a set of modules, where to get them, which versions to get, etc. and then manages them in your environment.  The program has the ability to do the initial checkout/download as well as keep them up to date, and remove them when they are no longer needed.  Unfortunatley, we could never really get it to work exactly right.  Specifically, git checkouts seemed to leave the code on an anonymous branch, rather than the requested branch.  I had to add some code to the newenv.sh script that would make sure the repos were all checked out on the right branch.  We were also never entirely sure how safe it was to develop in a module that was managed by librarian-puppet.</p>

<p>The other pain point is merging from one production environment to the others.  I will not discuss that in this post, but I think it really boils down to just general revision control pain.  Merging is necessarily complicated when it isn&rsquo;t easy.</p>

<p><strong>r10k</strong><br/>
So what can we do about librarian-puppet?  First I have to acknowledge that we may be doing something wrong.  I realize a lot of people have no problems with it, and if we could get the checkouts to work, could safely update without losing any changes, and there was active development, it would probably be fine for us.  But there is another option called <a href="https://github.com/adrienthebo/r10k">r10k</a>, developed by the folks at puppetlabs that I want to try out.  Rather than just managing modules, r10k manages all of the environments (as well as <a href="http://somethingsinistral.net/blog/rethinking-puppet-deployment/">some other features</a>.  This makes it rather opinionated about the development workflow.  It facilitates the process detailed <a href="https://puppetlabs.com/blog/git-workflow-and-puppet-environments">here</a>, which does not exactly match the way we do things.  Here is what it expects the puppet directory to look like:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>/etc/puppet
</span><span class='line'>|-- data
</span><span class='line'>|   |-- dev
</span><span class='line'>|   |-- feature1
</span><span class='line'>|   |-- feature2
</span><span class='line'>|   `-- master
</span><span class='line'>`-- environments
</span><span class='line'>    |-- dev
</span><span class='line'>    |-- feature1
</span><span class='line'>    |-- feature2
</span><span class='line'>    `-- master</span></code></pre></td></tr></table></div></figure>


<p>Basically r10k does two things: it checks out all branches of repos that you tell it about, and, like librarian-puppet, it reads a Puppetfile in the checked out repos and fetches the modules.  Because of the way it currently works, if you want r10k to manage the hieradata from a separate repo, it has to manage it outside of the environment (as above).  I feel pretty strongly that having the hiera data inside the environment is the <em>right</em> way, simply because then all environment-specific data is co-located.  Hiera already &ldquo;hides&rdquo; the data (vs. it being right in the code), and there is no reason to relocate it completely out of the environment.</p>

<p>r10k also does some optimizations on how git repos are managed to make repo operations fast, which are explained at the links above.</p>

<p><strong>Application</strong><br/>
So how do we improve our workflow?  I started this post with a clear answer to that question, but writing this has helped me see some problems with what I was thinking, some solutions to those problems and others, and now I will have to try a few things before I am sure what is going to work.  Stay tuned.</p>
]]></content>
  </entry>

  <entry>
    <title type="html"><![CDATA[puppet workflow take 1]]></title>
    <link href="http://chadh.github.io/blog/2013/12/12/puppet-workflow-take-1/"/>
    <updated>2013-12-12T01:13:00-05:00</updated>
    <id>http://chadh.github.io/blog/2013/12/12/puppet-workflow-take-1</id>
    <content type="html"><![CDATA[<p>When it comes to puppet development workflow, my colleagues and I have tried various solutions, and we are getting pretty close to something that we believe works for us.</p>

<!-- more -->


<p>In our environment, we have three groups with slightly different requirements.  The infrastructure group is deploying to file servers, dns servers, etc., and as one might expect, changes need to be carefully vetted before being pushed out.  The research support group is the testing ground.  They are constantly deploying new hardware, trying new versions of the OS and software, and stability frequently is less important than flexibility.  The instruction support group is a hybrid of the other two.  During the semester, there needs to be very few changes in order to ensure consistency in the labs.  But during the breaks, rapid development to update the repos and pull in new fixes, as well as to support any new hardware being deployed is the rule.  The last complication is that none of these environments is necessarily a strict subset of the others.  In some cases, infrastructure and research have a very devops-y feel, with research doing development, and then infrastructure hardening and preparing the changes for production. There are also components and implementations that only one or two of the groups will use.  It is the kind of code management that git is good at, if you use it properly, and we are still working on that.</p>

<p>We use git and dynamic environments in puppet.  It is a relatively common setup, except for the way we deal with hiera, although I do not believe there is a standard for that yet.  Our directories look roughly like this:</p>

<figure class='code'><div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>/etc/puppet
</span><span class='line'>├── environments
</span><span class='line'>│   ├── [environment]
</span><span class='line'>│   │   ├── hieradata
</span><span class='line'>│   │   │   ├── modules
</span><span class='line'>│   │   │   ├── nodes
</span><span class='line'>│   │   │   └── roles
</span><span class='line'>│   │   └── modules
</span><span class='line'>│   │       ├── apache
</span><span class='line'>│   │       ├── apachepassenger
</span><span class='line'>│   │       ├── autofs
</span><span class='line'>│   │       ├── ccbp
</span><span class='line'>|   |       ...            </span></code></pre></td></tr></table></div></figure>


<p>The tree is managed with multiple git repos.  There is one for each environment (subdirectory of <code>environments</code>).  Each module either comes from its own repo or the forge.  Initially we had the entire hieradata subdirectory in a repo, but we found that our usage made it realy hard to maintain it that way.  Currently, <code>hieradata/nodes/</code> and <code>hieradata/roles</code> is part of the environment repo, and <code>hieradata/modules</code> is in a separate repo.  I will mention more about how we do hiera here later, but the latter subdirectory contains <strong>site-specific</strong> in-module hieradata overrides, and we found that it made more sense to split that data out so we could easily keep updates in sync with the module.  We manage all of this using a script that clones the environment, grafts in the hieradata, and then runs librarian-puppet, which checks out all the modules either from a git repo or the forge.</p>

<p>I will just point out a couple of things: most of the repos are shared among our three groups, with each of us managing our own branch.  The rest of the repos (e.g., the environment repo) are completely different repos.  We could have used branches in a single repo, but we do not think that the code bases will stay similar enough to warrant that.  The second point is that our hiera data is <em>in the environment</em>; other folks in the community separate the hiera data directories completely from the environment.  So rather than <code>/etc/puppet/environments/myenvironment/hieradata</code>, they would have `/etc/puppet/hieradata/myenvironment/&lsquo;.  Six one way, one half dozen the other &hellip; or is it?</p>
]]></content>
  </entry>

  <entry>
    <title type="html"><![CDATA[CoreOS]]></title>
    <link href="http://chadh.github.io/blog/2013/10/18/coreos/"/>
    <updated>2013-10-18T17:46:00-04:00</updated>
    <id>http://chadh.github.io/blog/2013/10/18/coreos</id>
    <content type="html"><![CDATA[<p>I wrote about <a href="http://chadh.github.io/blog/2013/10/17/docker/">Docker</a> yesterday and teased that today I would discuss <a href="http://coreos.com">CoreOS</a>.  I came across CoreOS indirectly through a tweet about Kelsey Hightower&rsquo;s <a href="https://github.com/kelseyhightower/confd">confd</a>.  I will not go into confd, right now (have not really looked into it anyway), but it works with <a href="https://github.com/coreos/etcd">etcd</a>, which is one of features of CoreOS.</p>

<p>So what is CoreOS?</p>

<p>As its name suggests, it is a minimal OS.  It is based on the linux kernel and features <a href="http://www.freedesktop.org/wiki/Software/systemd/">systemd</a>, etcd, and docker.  It is essentially a hypervisor for containers.  There is no package manager as there is no software to manage.  Upgrades to the OS occur as atomic system updates, requiring a reboot to take effect.  I expounded at length about Docker&rsquo;s notion of containerized services, and CoreOS takes that to its logical conclusion: nothing runs directly on the host OS.</p>

<p>Whether or not CoreOS or something like it is the future of system provisioning, it provides a very easy to use platform for taking Docker for a spin.  I was able to easily bring up a vagrant box with CoreOS and start playing with Docker.  With a little more effort, Docker also works with Ubuntu 12.04 or later (with updated 3.8 kernel), but why bother with all the overhead?</p>

<p>Here&rsquo;s a session to show an Ubuntu container running inside CoreOS running inside Vagrant:</p>

<figure class='code'><figcaption><span>Vagrant/CoreOS Bash Session </span></figcaption>
<div class="highlight"><table><tr><td class="gutter"><pre class="line-numbers"><span class='line-number'>1</span>
<span class='line-number'>2</span>
<span class='line-number'>3</span>
<span class='line-number'>4</span>
<span class='line-number'>5</span>
<span class='line-number'>6</span>
<span class='line-number'>7</span>
<span class='line-number'>8</span>
<span class='line-number'>9</span>
<span class='line-number'>10</span>
<span class='line-number'>11</span>
<span class='line-number'>12</span>
<span class='line-number'>13</span>
<span class='line-number'>14</span>
<span class='line-number'>15</span>
<span class='line-number'>16</span>
<span class='line-number'>17</span>
<span class='line-number'>18</span>
<span class='line-number'>19</span>
<span class='line-number'>20</span>
<span class='line-number'>21</span>
<span class='line-number'>22</span>
<span class='line-number'>23</span>
<span class='line-number'>24</span>
</pre></td><td class='code'><pre><code class=''><span class='line'>ch145@strangepork:~/coreos-vagrant (master)$ grep DISTRIB /etc/lsb-release
</span><span class='line'>DISTRIB_ID=Ubuntu
</span><span class='line'>DISTRIB_RELEASE=12.04
</span><span class='line'>DISTRIB_CODENAME=precise
</span><span class='line'>DISTRIB_DESCRIPTION="Ubuntu 12.04.3 LTS"
</span><span class='line'>ch145@strangepork:~/coreos-vagrant (master)$ vagrant ssh
</span><span class='line'>Last login: Fri Oct 18 23:06:44 UTC 2013 from 10.0.2.2 on ssh
</span><span class='line'>   ______                ____  _____
</span><span class='line'>  / ____/___  ________  / __ \/ ___/
</span><span class='line'> / /   / __ \/ ___/ _ \/ / / /\__ \
</span><span class='line'>/ /___/ /_/ / /  /  __/ /_/ /___/ /
</span><span class='line'>\____/\____/_/   \___/\____//____/
</span><span class='line'>core@localhost ~ $
</span><span class='line'>
</span><span class='line'>core@localhost ~ $ grep DISTRIB /etc/lsb-release
</span><span class='line'>DISTRIB_ID=CoreOS
</span><span class='line'>DISTRIB_RELEASE=106.0.0
</span><span class='line'>DISTRIB_CODENAME="West Coast Style"
</span><span class='line'>DISTRIB_DESCRIPTION="CoreOS 106.0.0 (Official Build) dev-channel amd64-generic test"
</span><span class='line'>core@localhost ~ $ docker run -t -i ubuntu grep DISTRIB /etc/lsb-release
</span><span class='line'>DISTRIB_ID=Ubuntu
</span><span class='line'>DISTRIB_RELEASE=12.04
</span><span class='line'>DISTRIB_CODENAME=precise
</span><span class='line'>DISTRIB_DESCRIPTION="Ubuntu 12.04 LTS"</span></code></pre></td></tr></table></div></figure>


]]></content>
  </entry>

  <entry>
    <title type="html"><![CDATA[Docker]]></title>
    <link href="http://chadh.github.io/blog/2013/10/17/docker/"/>
    <updated>2013-10-17T20:04:00-04:00</updated>
    <id>http://chadh.github.io/blog/2013/10/17/docker</id>
    <content type="html"><![CDATA[<p>While at DevOps Days Atlanta a couple weeks ago, I was introduced to <a href="https://www.docker.io">Docker</a>, which is tool for managing <a href="http://lxc.sf.net">lxc linux containers</a>.  It was some really powerful features that are changing the way some people provide services and doing Infrastructure CI.</p>

<!--more-->


<p> Ever since I saw a presentation on Solaris zones, I have wanted to play with containers in linux, but when I tried <a href="http://openvz.org">OpenVZ</a>, it was pretty heavy-weight &mdash; required a custom kernel, and there was a lot of manual configuration to get it all going.  I never got to play with <a href="http://linux-vserver.org">Linux-Vserver</a>), but I take it that the same thing applied there.  Those two projects have been around quite a while, and they are apparently still the right way to do &ldquo;real&rdquo; containerization with all the security implications that arise when allowing users privileged-level access to the containers.</p>

<p>Over the last few years, the various container projects have slowly been getting parts of their kernel modifications into the mainline kernel, and with the recent finalization of user namespaces, apparently kernel support has hit critical mass.  There has been a lot of movement with lxc in the last month or two, and Docker seems to be attracting a lot of attention.</p>

<p>Docker has two particularly neat use cases: <em>scriptability</em> and <em>service isolation</em>.  The former is pretty standard for the seemingly never ending parade of continuous integration tools that are so popular these days.  Using a <code>Dockerfile</code> (compare to <code>Vagrantfile</code>, <code>Puppetfile</code>, etc.), one can specify a sequence of directives to build, unpack, create, and customize a container.  The end result is very similar to spinning up a vagrant box, except it is much faster.  There is no booting overhead &mdash; just unpack a filesystem, and the container is ready.  Furthermore, Docker caches each step of the process (like with a <code>Makefile</code>), so the next time you create the container, it is essentially instantaneous.</p>

<p>The other use case for which Docker is really known is service isolation.  That is, they encourage creating a container per service.  For instance, if your application stack requires a web server, message broker, and database, create three separate containers.  While this strategy might add some processing overhead, it can greatly reduce the configuration overhead.  Frequently the different parts of a software stack have very specific requirements.  As an example, puppet and foreman are both ruby web applications.   There have been times when I wanted to (or maybe had to) use different versions of ruby for each.  That is possible to do in a single deployment, but it is relatively complictated to set up.  With containers, puppet would run in a container, foreman would run in another, and they each could have their own ruby stack.  There is another advantage if you are running applications that are picky about what port on which they listen.  Docker can map the container port to a different host port.</p>

<p>I think what really draws me to those particular features is that they address two of my biggest time sinks.  I spend <strong>a lot</strong> of time waiting for VMs to spin up and then software to install when testing provisioning scripts.  Spinning up a container is really fast; the timings I have seen online are sub-second after the first time.  I also spend far too long trying to integrate the new hotness into my environment.  With containers, if the software author wrote his installation process for ubuntu 12.04, I can just use a 12.04 container.  If the developer only supports Red Hat, I can do that too.</p>

<p>Wow, this was really long, and that was just one part of the discussion.  The obvious use case for this in my environment is the example I used.  I plan to refactor the puppet master deployment to use Docker for managing puppet master, puppet db, and foreman containers.  Next time I&rsquo;ll talk about <a href="http://coreos.com">coreos</a> as the base for running Docker.</p>
]]></content>
  </entry>

  <entry>
    <title type="html"><![CDATA[First Post]]></title>
    <link href="http://chadh.github.io/blog/2013/10/17/first-post/"/>
    <updated>2013-10-17T19:56:00-04:00</updated>
    <id>http://chadh.github.io/blog/2013/10/17/first-post</id>
    <content type="html"><![CDATA[<p>Not sure if I will stick with this, but I&rsquo;m trying out a blog on github pages.  I like the idea of having my entries in human readable files, checked into git, but I am not sure if markdown is going to be expressive enough for me.</p>
]]></content>
  </entry>

</feed>