riemannmc still in a bit of a state

I am realising that I am learning a lot about Unix and infra, but that I could learn all of this without ever learning anything about DevOps. I can see where DevOps could make my life easier, but the books I am following strip out automation and duplication in order to explain config and the like.

Anyway, my Riemann Mission Control host is still in a state even when I free up disk space. In today’s lesson I was setting up the collectd write_riemann plugin and hit the following error on my problem-child host:

/etc/collectd.d$ sudo service collectd start
 * Starting statistics collection and monitoring daemon 
collectd ERROR: lt_dlopen ("/usr/lib/collectd/write_riemann.so") failed: file not found. The most common cause for this problem is missing dependencies. Use ldd(1) to check the dependencies of the plugin / shared object.

This led to my first contact with the ldd command, which revealed:

libprotobuf-c.so.0 => not found

Finally, after a bit of research, the following fixed my issue:

sudo apt-get install protobuf-c-compiler protobuf-compiler libprotobuf-c0 libprotobuf-c0-dev
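As a minimal sketch of the ldd check above, the output can be filtered down to just the libraries reported as missing. The function name missing_libs is made up for illustration; the plugin path is the one from the collectd error message.

```shell
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output on stdin, so it can be tried on saved output as well.
# (missing_libs is a hypothetical helper name, not part of collectd or ldd.)
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage (plugin path taken from the collectd error message):
#   ldd /usr/lib/collectd/write_riemann.so | missing_libs
```

Running the same one-liner on each host would be a quick way to see whether a dependency is absent on only one of them.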

Having configured this plugin on four Ubuntu hosts, I wonder why this dependency was missing on only one. I have two theories. One – whatever is busting my disk space may be related to the absence of a working dependency. Two – I may have missed a step on one of the four hosts.

Either way, if I had used configuration management tools or containerised, automated builds, I suspect this error would not have occurred at all, or at the very least it could have been fixed once before building four hosts from the same image/container/scripts.

One thought I had was that it would be great to rebuild riemannmc. Of course, retracing my steps would take a while unless I had, for example, a Docker image to help me out.

The other issue I had today is that collectd on my Red Hat hosts is not logging.  One for tomorrow.

Dog-walk learning, Riemann mystery continues and the return of LCD Soundsystem

Dog-walk learning

Quiet-ish couple of days on the mission front. Rugby and work have been filling a lot of time. However, in recent months I have learnt to regain the time lost to dog-walking by listening to the following wonderful podcasts:

 

All brilliant in their own ways. My introduction to James Turnbull and his books came from DevOps Cafe. The Kelsey Hightower episode was a classic too. The list of reminders I add to my phone to look at post-walk grows every time.

If anyone can recommend any more podcasts, please let me know.

Riemann mystery continues

I have not had time to look at this in any detail, but the issue remains. I clear out the huge files and get an email the following day when the next log file grows to consume all of my disk space.
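A rough sketch of how the biggest files under a directory could be listed before clearing them out. It assumes GNU find (for -printf), and both the helper name largest_files and the /var/log/riemann path in the usage line are assumptions rather than details from my hosts.

```shell
# List the N largest regular files under a directory, biggest first.
# Output is "<size in bytes><TAB><path>", one file per line.
# Assumes GNU find, which provides -printf; largest_files is a hypothetical name.
largest_files() {
    dir=$1
    count=$2
    find "$dir" -type f -printf '%s\t%p\n' | sort -rn | head -n "$count"
}

# Usage (the log directory is an assumption):
#   largest_files /var/log/riemann 5
```

Sorting by byte size rather than modification time makes it easier to spot the one file that is swallowing the disk.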

It appears the issue occurs, in Art of Monitoring speak, on my Riemann Mission Control host. So maybe the other Riemann hosts are spamming it. We will see. The other two Riemann hosts and all Carbon hosts are performing well.

LCD Soundsystem are back

Finally, it is always a good week when this lot put out new music. It is not related to my mission, but I am sure it will soundtrack much of my studying in the coming weeks.

Art of Monitoring and Today’s Updates

I am a big fan of James Turnbull (https://jamesturnbull.net/), mainly because his Docker and Art of Monitoring books have been incredibly relevant to one of my day jobs. I really knew nothing about Docker until I went through his book step by step. It has since occurred to me that, because I am starting from such a low base, many of the benefits of Docker over non-container technologies are probably lost on me.

That being said, I am about a third of the way through The Art of Monitoring (https://artofmonitoring.com/) and, having launched six AWS EC2 instances and configured each with Riemann, Grafana, Carbon and collectd, I think I get it better than I did before. In fact, I have added being able to containerise these tools to my learning and to-do list, which is getting longer every day (and I may include it on this blog at some point).

I have also added all of the James Turnbull books to my to-do list, as I see them as being critical to achieving my mission.

In today’s updates, I have been getting emails like this:

This is quite exciting.  Turns out Riemann logging has been going barmy on one of my six EC2 instances:

No idea why yet, but it has added further to my to-do list:

  • Why is this happening and why on only one host?
  • How do I set up a log rotation policy (although it would have to be severe to control this scenario)?
  • Why are the email notifications bouncing with “address not found”?  (I forgot to mention that.)
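On the log rotation question, here is a hedged sketch of what a size-based logrotate policy might look like, written as a small shell helper that prints the stanza. The directives (size, rotate, compress, missingok, notifempty, copytruncate) are standard logrotate options; the helper name, the Riemann log path and the values are assumptions for illustration, not a tested fix.

```shell
# Print a logrotate stanza that rotates a log once it passes a size limit.
# All arguments are illustrative; logrotate_stanza is a hypothetical helper.
logrotate_stanza() {
    log_path=$1   # file to rotate (the Riemann path below is an assumption)
    max_size=$2   # e.g. 100M
    keep=$3       # number of rotated files to keep
    cat <<EOF
$log_path {
    size $max_size
    rotate $keep
    compress
    missingok
    notifempty
    copytruncate
}
EOF
}

# Usage (path and limits are assumptions):
#   logrotate_stanza /var/log/riemann/riemann.log 100M 3 | sudo tee /etc/logrotate.d/riemann
```

Note that the size limit is only checked when logrotate actually runs, which is typically once a day from cron, so a log that fills the disk within hours would also need logrotate scheduled more often. copytruncate is chosen here so the writing process does not need to reopen its log file, at the cost of possibly losing a few lines during the copy.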

Back soon.