Wednesday, January 28, 2015

Installing Graphite for Dashboards of System Metrics

I’m doing this on a CentOS minimal install, and all the Graphite “stuff” is going in the default locations. I hear you “don’t need as much space as you think”, but that you do want to use fast disks (read “SSDs”).


Graphite install steps:


http://ift.tt/1Ermfok


2 Keys:

1. Install EPEL

2. install pip via yum (not get-pip.py)


1. yum update -y

– yeah, this can take a while, but it’s worth it


2. yum install wget


3. Install EPEL (this is key; otherwise all the following steps will mysteriously fail):

wget http://ift.tt/1htTSIk

sudo rpm -Uvh epel-release-6*.rpm


4. yum install python-devel


5. yum install python-pip


6. pip install http://ift.tt/15ToC42


7. pip install whisper


8. pip install carbon


9. pip install graphite-web
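Putting steps 1 through 9 together, here's the whole thing as one rough sketch (same URLs as the steps above; CentOS 6 minimal install, run as root or prefix each line with sudo):

yum update -y
yum install -y wget
# EPEL first, or the later yum and pip steps will mysteriously fail
wget http://ift.tt/1htTSIk
rpm -Uvh epel-release-6*.rpm
yum install -y python-devel python-pip
pip install http://ift.tt/15ToC42
pip install whisper carbon graphite-web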


From here follow this step-by-step:


http://ift.tt/1Ermfom


EPEL install:


http://ift.tt/1sfqNHu


I had some problems installing carbon: some “gcc” command failed in the very last step, throwing angry red text (like this: http://ift.tt/1ErmeRg). Anything that takes that much work is worth abandoning. Then I saw yum has it (in EPEL). Well, I couldn’t figure out how to uninstall pip. heh heh:


pip uninstall pip





Friday, January 23, 2015

Amazon AWS EC2 instance was configured with a larger drive, but I don’t see the extra space

Well, if you make the single drive larger than the default, the extra space won’t automatically show up (at least it hasn’t in my experience).


You have to run resize2fs to “see” it.


Below is what it looked like for me. Notice there was no indication of the extra space on the drive.


[root@……..]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/xvde             7.9G  1.1G  6.5G  15% /
tmpfs                 1.9G     0  1.9G   0% /dev/shm
[root@………..]# resize2fs /dev/xvde
resize2fs 1.41.12 (17-May-2010)
Filesystem at /dev/xvde is mounted on /; on-line resizing required
old desc_blocks = 1, new_desc_blocks = 32
Performing an on-line resize of /dev/xvde to 131072000 (4k) blocks.
The filesystem on /dev/xvde is now 131072000 blocks long.

[root@…….]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/xvde             493G  1.1G  467G   1% /
tmpfs                 1.9G     0  1.9G   0% /dev/shm


After a while, it came back and I had the 500GB I had configured the instance to have.


While resize2fs is running, if you type “df -h” repeatedly you can see the volume growing. Good fun when you’re staring at an SSH console ;)
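If you want to double-check that the block device really is bigger than the filesystem before you resize, something like this works on the same kind of setup (the last line is just the “run df -h repeatedly” trick above, wrapped in watch):

blockdev --getsize64 /dev/xvde   # raw size of the block device, in bytes
df -h /                          # what the mounted filesystem currently reports
resize2fs /dev/xvde              # grow the ext filesystem to fill the device
watch -n 5 'df -h /'             # watch the Avail column climb while it runs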





Thursday, January 22, 2015

Sharing OSX files over the network with Windows 7, 8, and 8.1

In OSX go to System Preferences > Sharing > File Sharing, then look for the option to share files and folders with Windows (SMB). Heh heh.. That’s great and all. But in my case it didn’t work: “incorrect password” or whatnot. And I kept hitting that wall.


Well, the solution was to “uncheck all the things”, save, then “check all the things”. By that I mean: uncheck/deselect/turn off the OSX accounts that are given permission to be used in SMB file shares from Windows machines. Then save. Then re-check/select/turn them back on. Yeah, it’s really that boneheaded ;)


But the reason is that somewhere under the hood the configs got clobbered, probably by some system update. So even though the OS displays that the user should work for the file share, it actually doesn’t.






Saturday, January 17, 2015

Splunk: How to effectively remove a field from results if there are no non-null values in it

In my case, I needed to use rex to extract a “message” field that may or may not be present in an event, and when it was present it could be really dirty (since it’s user-generated text). However, rex has the side effect of *always* creating the field you specify, even if there is no actual match in the event. As a result, every search had that rex-extracted field name, which was not desired (and confusing to see a blank “message” field for, say, telemetry events).


Create a new field set to the value of the field you may want to get rid of, drop the original field, then rename the copy back to the original name, e.g.


say the field in question is named “possiblynull”




| eval maybenull=possiblynull

| fields - possiblynull

| rename maybenull as possiblynull


This way, if the original field is actually empty, your search results will not end up with it present.
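Put together with the rex extraction that started all this, the full search looks something like the following (the sourcetype, regex, and field names here are made up for illustration):

sourcetype=myapp_events
| rex field=_raw "message=(?<possiblynull>.*)"
| eval maybenull=possiblynull
| fields - possiblynull
| rename maybenull as possiblynull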


A few Splunk Answers related to this scenario:


http://ift.tt/1xgQkiE


http://ift.tt/15bH4p7





Friday, January 16, 2015

Splunk subsearch where you want it to only return a single value

By default a Splunk subsearch returns something of the form “fieldname=24”. If you only want it to return the “24” part, just name the field in the subsearch “query”. Yeah, it’s a magic term for just such a scenario.
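For example (index and field names are made up): if the subsearch’s field were left as userid, the outer search would receive userid=24; renaming it to query, as below, makes the outer search receive just the bare 24 instead:

index=main [ search index=lookups sourcetype=users | head 1 | eval query=userid | fields query ]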





Splunk DB Connect limits

I wanted to use Splunk DB Connect to automatically incrementally query data from a database server based on data found in another database server.


So the idea was this:


Use dbquery to get the most recent date found in the destination database table, pipe that to a “map” command that lets me run a second dbquery against the source database table (using the result of the first dbquery as the starting date), do some filtering and transforms, and then use dboutput to send the results to the destination table.


Well, that doesn’t work. It errors out with a null pointer exception. It all works fine as soon as I remove the “map” command segment, which means I have no way to grab exactly those records that are new since the last extraction.


My guess is that since the “map” command fires off one query per result row from the previous search, dboutput can’t reference a single search to use as the source of the data it will insert. I suspect they could fix this so it looks at whatever results are being piped into it; currently, I’m guessing it looks for the base search.


I expect the current behavior is actually expected, so I bet the behavior I’m hoping for would be a feature request.


Here’s the Splunk Answers question I submitted for this:


http://ift.tt/1udaB7N





Thursday, January 15, 2015

Splunk and XML and JSON

Even with Splunk 4.3.3, you can have it automatically pull out XML and JSON fields at search time.


This means you can query a database table in real time, generate a table of data where each column is an XML and/or JSON element, then push it all to another DB table.


“spath” is the search command you want to use, and usage is something like this:






| spath input=MyFieldThatHasXMLInIt


Yeah, that’s all; it’s seriously great. Note by default spath only looks at the first 5000 characters in that field, so if you have larger fields you will need to override the system defaults by adding the below ini section to /opt/splunk/etc/system/local/limits.conf (adjust the value to whatever is appropriate for your situation). I got this section from /opt/splunk/etc/system/default/limits.conf and tried just pasting it in and restarting Splunk:




[spath]

# number of characters to read from an XML or JSON event when auto extracting

extraction_cutoff = 8000
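After pasting that in, Splunk needs a restart to pick up the change; on a default install that’s just:

/opt/splunk/bin/splunk restart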





Wednesday, January 14, 2015

Using Splunk to extract XML and JSON fields using spath, but the 5000 character limit prevents it from getting everything

Some events had xml that was longer than 5000 characters, and spath wasn’t extracting all the fields I knew were in there.


Here’s how to fix it:

Override the spath character limit in $SPLUNK_HOME/etc/system/local/limits.conf.


My exact edit was to add the below config section to /opt/splunk/etc/system/local/limits.conf (since it wasn’t there by default in 4.3.3). I pulled this from /opt/splunk/etc/system/default/limits.conf:


[spath]

# number of characters to read from an XML or JSON event when auto extracting

extraction_cutoff = 10000


There are a number of unanswered Splunk Answers questions about this; here’s one:


http://ift.tt/1AgfqPN


spath docs don’t mention an override available at search time: http://ift.tt/1AgfrTy


spath is fantastic, btw. It auto-extracts XML and JSON. It can even do JSON embedded within XML fields. Just do “spath input=fieldthatcontainsxmlorjson”. However, if you have potentially large XML fields you will need to increase the limit on the number of characters spath looks at.
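As a made-up illustration of the JSON-inside-XML case: if a field named ServiceResponse holds something like <response><body>{ ...json... }</body></response>, you can unpack it in two passes, first pulling the JSON string out of the XML element and then running spath again on that:

| spath input=ServiceResponse path=response.body output=jsonbody
| spath input=jsonbody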





Tuesday, January 13, 2015

Using Splunk as an ETL Tool for Data Residing in a Relational Database

Use Splunk to Extract, Transform, and Load data from an existing OLTP database into another OLTP database. It’s especially great if your source data has XML or JSON (imagine JSON stored in an XML field – Splunk can handle that no problem).


In my case, there was data in a table that stored web service requests and responses.


I used DB Connect (the Splunk app) to query the last 2 days’ worth of data, extracting the XML key/value pairs and cleaning up the data so the Splunk search output a clean table of data. I then filtered that data with a Splunk subsearch that returned the most recent request time already in my destination table (on a different SQL Server instance). Then I piped it all to dboutput pointed at the table. The table was created to ignore any duplicate keys, just in case. Then, I saved the search and scheduled it to run every hour.
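The “ignore any duplicate keys” part lives on the SQL Server side, not in Splunk. A hedged sketch of what that destination table might look like (the table and column names just mirror the dboutput example in step 6 below; adjust the types and key columns to match your data):

-- IGNORE_DUP_KEY turns a duplicate-key insert into a warning and skips the row
CREATE TABLE dbo.splunktest (
    [Character]  nvarchar(100) NOT NULL,
    RequestTime  datetime      NOT NULL,
    UserId       int           NOT NULL,
    RequestType  nvarchar(50)  NOT NULL,
    CONSTRAINT PK_splunktest PRIMARY KEY (RequestTime, [Character])
        WITH (IGNORE_DUP_KEY = ON)
);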


Oh yeah, since none of this data is indexed in Splunk, it doesn’t count against your daily licensing limits :)


Here’s roughly what my Splunk search looked like (I’ll try to build it up in stages). I’m running Splunk Enterprise 4.3.3, btw:


First use dbquery to get the data from your source table.


1. Install DB Connect app (I got the latest version)


2. Set up the database connection to your source DB


3. Set up the database connection to your destination DB (make sure to uncheck the “Read Only” box, otherwise you’ll get messages about the database being read-only).


4. My dbquery ended up looking something like this (queries from the Splunk external database connection named “DWH”):


| dbquery DWH "select * from LogDB.dbo.SourceServiceLog l (nolock) where l.RequestTime > dateadd(dd,-2,convert(date,getdate()))"


5. Use Splunk Search commands to clean up the data (I’ll leave this to you and the docs… but I’ll have more posts on clever uses)


6. Pipe your search results to dboutput. Mine looked something like this (it naively inserts the fields specified at the end of the command into the table splunktest in the database “MyDatabaseName”):


| dboutput type=insert database=MyDatabaseName table=splunktest key=Character RequestTime UserId RequestType


Note, when specifying the table to insert into with dboutput you cannot fully qualify a table name, e.g. MyDatabaseName.dbo.splunktest.


Note, the select query with dbquery can be any complex query you want. Definitely avoid any double quotes though.


This technique does not seem to count against your daily indexing license limits.





Monday, January 12, 2015

Puppet Forge Module Installation – make sure they’re in the right directory

I had been working on setting up the seteam-splunk module on Puppet 3.7 (for managing Splunk installations). Besides the fact that the module doesn’t purge changes to the inputs.conf files, I hit the following error when I went to install it in our production systems:

Could not autoload puppet/provider/splunkforwarder_input/ini_setting: undefined method `provider’ for nil:NilClass


In my case, this was due to the fact that when I installed the module, by default it went into a different directory. I had to specify the target directory for each of the 3 modules involved:


puppet module install seteam-splunk --target-dir /opt/puppet/modules/


You may even have to uninstall the 2 dependency modules and reinstall them, explicitly setting the target directory (the --force flag is needed because each is a dependency of another module):


puppet module uninstall puppetlabs-inifile --force

puppet module install puppetlabs-inifile --target-dir /opt/puppet/modules/


puppet module uninstall nanliu-staging --force

puppet module install nanliu-staging --target-dir /opt/puppet/modules/
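If you’re not sure which directory Puppet is actually reading modules from (and therefore what to pass to --target-dir), these two commands are handy:

puppet config print modulepath                        # where Puppet will actually look
puppet module list --modulepath /opt/puppet/modules/  # confirm the modules landed there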





Friday, January 9, 2015

What does this mean: Error: No such file or directory – getcwd

When working in Puppet, I’d occasionally get this at the top of a slug of scary red text errors:


Error: No such file or directory – getcwd


Well, it really just means your current working directory doesn’t exist. Usually, I had just deleted some directory and I was in that path at the time.


“getcwd” means “Get Current Working Directory”.


Just like “pwd” means “Print Working Directory”.
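You can reproduce it in a couple of lines if you want to see it for yourself (any throwaway path will do):

mkdir /tmp/goner && cd /tmp/goner
rm -rf /tmp/goner    # delete the directory you're currently sitting in
# anything that calls getcwd() from here on (puppet, for example) will complain
cd ~                 # the fix: move to a directory that still exists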


Figured the internet could always use another bit of information on something a new Linux user may not have encountered [yet].





puppetlabs-splunk Puppet Module does not purge non-specified ini file elements

Here is a post that outlines the problem I hit:


http://ift.tt/1wDxtxX


I have not been able to identify where the bug is.


In short, the module should allow you to specify the full contents of Splunk’s inputs.conf. However, it currently seems to have a bug where it won’t remove settings you haven’t specified.


I’d really rather just use the module and not have to modify it to make it work.





Wednesday, January 7, 2015

Impala 2.0 dies if you query on gzip files that are too big to fit in memory

Say you run Sqoop2 to import a very large table (say, 6 billion rows) and have it output to gzip.


Well, it’ll happily do all that, but you will end up with files of 3.5 GB or something. Yay! How space-efficient!


Then you make an external table that points at those files and query it with Hive. An hour or so later you’ll get results. Yeesh. Well, maybe Impala is faster... hmm... instead you get an error message like this:


“Bad status for request 708: TGetOperationStatusResp(status=TStatus(errorCode=None, errorMessage=None, sqlState=None, infoMessages=None, statusCode=0), operationState=5, errorMessage=None, sqlState=None, errorCode=None)”


What’s happened is you’ve killed Impala because you don’t have enough memory on the machines (and dying is what Impala is configured to do if it’s the thing using up all the memory on a node). You’ll see Impala is dead in the Cloudera Manager main dashboard, so restart it. Since the data is all in gzip files, which aren’t splittable, each file has to be decompressed in full in order to do anything with it, and each of those 3.5GB files will blow up to, well, a whole lot bigger than 3.5GB.
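A quick way to see whether you’re in this boat is to compare the sizes of the files behind the table with the cluster’s block size (the HDFS path here is made up):

hdfs dfs -du -h /user/sqoop2/mybigtable      # per-file sizes of the table's data files
hdfs getconf -confKey dfs.blocksize          # the block size to aim for when splitting them up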


You have to use something like Filecrush to break those files up into smaller pieces, preferably no larger than your HDFS block size (which Filecrush will happily do by default).


I’ve got some posts about setting up and using Filecrush:


http://ift.tt/14pOyE3


Filecrush project:


http://ift.tt/1gzGOis





Monday, January 5, 2015

Random numbers in TSQL queries

Say you want to generate test and control groups directly in the results of a query, with each row landing in either the control group or the test group.


Here’s one way to roll a random number between 0 and 1:

select ABS(CONVERT(BIGINT,CONVERT(BINARY(8), NEWID()))) % 2


Here’s how to roll a random number between 0 and 2:

select ABS(CONVERT(BIGINT,CONVERT(BINARY(8), NEWID()))) % 3


0 and 5:

select ABS(CONVERT(BIGINT,CONVERT(BINARY(8), NEWID()))) % 6


0 and 18:

select ABS(CONVERT(BIGINT,CONVERT(BINARY(8), NEWID()))) % 19
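And the test/control idea from the top looks something like this in practice (table and column names are made up; NEWID() is handy here because it’s evaluated per row, whereas an unseeded RAND() returns the same value for every row in the statement):

-- each row gets a 0 (control) or 1 (test) assignment on the fly
SELECT  UserId,
        ABS(CONVERT(BIGINT, CONVERT(BINARY(8), NEWID()))) % 2 AS TestGroup
FROM    dbo.Users;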





Friday, January 2, 2015

YARN not starting

Here’s the error I saw (from “Recent Log Entries” in Cloudera Manager, after clicking on the Details for the failed YARN startup step when restarting the cluster):


Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Corruption: 13 missing files; e.g.: /tmp/hadoop-yarn/yarn-nm-recovery/nm-aux-services/mapreduce_shuffle/mapreduce_shuffle_state/000005.sst


There is a bug already submitted in Jira for YARN that seems to encompass this error I saw. It also seems to include a workaround:


http://ift.tt/141Wo6x


Fix:


In short, remove or rename the CURRENT file in these 2 paths and then restart YARN (or delete the files, or I think you could even just reboot each affected node since the /tmp folder may be cleared out on reboot):


/tmp/hadoop-yarn/yarn-nm-recovery/yarn-nm-state


/tmp/hadoop-yarn/yarn-nm-recovery/nm-aux-services/mapreduce_shuffle/mapreduce_shuffle_state
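On each affected node, the workaround boiled down to something like this before restarting YARN from Cloudera Manager (same paths as above):

# move the leveldb CURRENT files aside so the NodeManager rebuilds its recovery state
mv /tmp/hadoop-yarn/yarn-nm-recovery/yarn-nm-state/CURRENT{,.bak}
mv /tmp/hadoop-yarn/yarn-nm-recovery/nm-aux-services/mapreduce_shuffle/mapreduce_shuffle_state/CURRENT{,.bak}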