Wednesday, January 14, 2015

Extracting XML and JSON fields in Splunk with spath when the 5000-character limit stops it from getting everything

Some events had XML that was longer than 5000 characters, and spath wasn’t extracting all the fields I knew were in there.
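A quick way to confirm that event length is the culprit is to count the events that exceed the limit. This is only a sketch: the index and sourcetype names are placeholders I made up, and it assumes the XML lives in _raw (swap in your own field name if it lives elsewhere).

```spl
index=main sourcetype=my_xml_events
| eval raw_len=len(_raw)
| where raw_len > 5000
| stats count AS events_over_limit, max(raw_len) AS longest_event
```

If events_over_limit is non-zero, longest_event gives a rough idea of how high the cutoff needs to go.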


Here’s how to fix it:

Override the spath character limit in $SPLUNK_HOME/etc/system/local/limits.conf.


My exact edit was to add the config section below to /opt/splunk/etc/system/local/limits.conf (it wasn’t there by default in 4.3.3). I pulled the section from /opt/splunk/etc/system/default/limits.conf and raised the value:


[spath]
# number of characters to read from an XML or JSON event when auto extracting
extraction_cutoff = 10000


There are a number of unanswered Splunkbase questions:


http://ift.tt/1AgfqPN




The spath docs don’t mention any override available at search time: http://ift.tt/1AgfrTy


spath is fantastic, btw. It auto-extracts XML and JSON. It can even do JSON embedded within XML fields. Just do “spath input=fieldthatcontainsxmlorjson”. However, if you have potentially large XML fields you will need to increase the limit on the number of characters spath looks at.
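As a rough illustration of the two ways to call it, here is a sketch with made-up names: the index, sourcetype, payload field, and order.id path are all hypothetical, so substitute your own. The first search auto-extracts everything spath can find in the payload field:

```spl
index=main sourcetype=my_xml_events
| spath input=payload
```

The second pulls a single value into a named field by giving spath an explicit path:

```spl
index=main sourcetype=my_xml_events
| spath input=payload output=order_id path=order.id
| table _time order_id
```

The explicit form is handy when you only care about one or two fields and don’t want every element turned into its own field.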




