Cascading – Hadoop *nix utils

by Security Dude

One of the best parts of my week is eating the yummy food that Etsy provides for us on Tuesdays and Thursdays. It’s also an awesome opportunity to eat lunch with the wonderful engineers at Etsy. I had the pleasure of meeting an engineer from the “data team”. We had a quick conversation about the technology stack for their data analysis.

The dude told me about Cascading. Frankly, I had never heard of the tool, so I had to look it up. I’ve seen similar workflow-processing tools at film and animation studios, where “pipeline” tools applied the concepts of grep, sed, awk, cut, join, uniq, wc, tr, cat, and sort to routine work challenges, with awesome results. Cascading is the Hadoop version of grep, sed, awk, cut, join, uniq, wc, tr, cat, and sort.
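To make the "*nix utils as a pipeline" idea concrete, here is a minimal sketch in plain Python. It is not the Cascading API and does not touch Hadoop; it only imitates a `grep | cut | sort | uniq -c` chain on a handful of in-memory lines. The log lines, their format, and the helper names (`grep`, `cut`, `count_unique`) are all made up for illustration.

```python
from collections import Counter

# Hypothetical firewall-style log lines (invented for this example).
LOG_LINES = [
    "2013-01-01 ALLOW 10.0.0.1",
    "2013-01-01 DENY 10.0.0.2",
    "2013-01-01 DENY 10.0.0.2",
    "2013-01-01 DENY 10.0.0.3",
    "2013-01-01 ALLOW 10.0.0.2",
]


def grep(lines, needle):
    """Like grep: keep only the lines that contain needle."""
    return (line for line in lines if needle in line)


def cut(lines, field, sep=" "):
    """Like cut: keep one field per line (zero-indexed here, unlike cut -f)."""
    return (line.split(sep)[field] for line in lines)


def count_unique(values):
    """Like sort | uniq -c | sort -rn: count values, most frequent first."""
    return Counter(values).most_common()


# Chain the stages the way a shell pipe would: filter, project, aggregate.
denied = count_unique(cut(grep(LOG_LINES, "DENY"), 2))
print(denied)  # [('10.0.0.2', 2), ('10.0.0.3', 1)]
```

Each stage takes a stream of records and hands a new stream to the next one, which is the same mental model as a shell pipe. As I understand it, a tool like Cascading expresses that kind of chain so it can run across a cluster instead of on one machine.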

My jaw is still open. I can’t wait to process large volumes of clickstream, firewall, and IPS/IDS traffic. Maybe DFIR stuff. (Close mouth).