The Needle in the Haystack: Find the Offending File Robert K. Henry • CISSP, GCIH, GCFA • Information Security Officer
HR Has an Employee Grievance • Hostile Workplace – Sexual Harassment • Inappropriate/offensive files stored on web server and displayed in office • College Staff Already Involved
College Investigation • Course Site Files Deleted • Six weeks prior to HR grievance report • No Backups! • Backup System on the fritz at time files were deleted
College Investigation • How do we get the goods? • College systems admin made manual backups to local PC drive • Not removed from local drive after backup system was repaired
The Mission: • Find inappropriate material among 6 GB (more than 7000 files) of mixed images, word-processed, and text files. • Identify owner/creator of files
Search Options • Manual • grep • ssdeep • foremost • sorter • Content Based Image Retrieval, CBIR • Evaluation Criteria: • Easy! • Free!
Search Options • Manual (The First Responder's Strategy) • Thumbnails • Slide Show • One-at-a-time • zzzzzzzzzzzzzzzzzz! • Too much room for error • Pretty Inefficient (32 hours of searching) • Two people spent two workdays each going through the DVDs
Search Options • But . . . • it worked! • Identified inappropriate word-processed files and images in one directory on one of the DVDs • Due to multiple file copying, creator/owner of files doesn't show up in Windows file properties • Did I mention the files were uploaded via ftp with shared user IDs? • Not much accountability!
Search Options There’s gotta be an easier way!
Search Options--grep • Built-in *nix string search command, also available for Windows • Steps to conduct search with grep (1) • Make a forensic image of the disks #dd if=/dev/sr0 of=dvdimage.img conv=noerror,sync
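The imaging step can be paired with a hash so that later copies of the image can be checked against the original. The following is a minimal sketch, not the procedure used in the investigation: `image_and_hash` and `verify_image` are hypothetical helper names, and it assumes GNU coreutils (`dd`, `sha256sum`).

```shell
#!/usr/bin/env bash
# Sketch only: image_and_hash and verify_image are hypothetical helpers.

# Copy a source device or file to an image and record its SHA-256 hash.
image_and_hash() {
    local src="$1" img="$2"
    # conv=noerror,sync keeps reading past bad blocks and pads short reads
    # with zeros; dd's statistics and any errors are kept in a log file.
    dd if="$src" of="$img" conv=noerror,sync 2> "$img.log" || return 1
    # Store the hash next to the image so it can be re-checked later.
    sha256sum "$img" > "$img.sha256"
}

# Re-check an image against its recorded hash (non-zero exit on mismatch).
verify_image() {
    sha256sum -c --quiet "$1.sha256"
}
```

Usage would look like `image_and_hash /dev/sr0 /evidence/dvdimage.img` followed by `verify_image /evidence/dvdimage.img` before each analysis session. Because of `sync` padding, the image hash may differ from a hash of the raw source when the source size is not a multiple of the block size.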
Search Options--grep • Steps to conduct search with grep (2) • Extract Strings • ASCII strings first #strings --radix=d dvdimage.img > dvdimage.str • Unicode (16-bit little-endian) strings second #srch_strings -t d -e l dvdimage.img > dvdimage.uni.str
Search Options--grep • Steps to conduct search with grep (3) • Examine Strings Files • Create “dirty word” file • Use “dirty word” file to search strings for, well, dirty words #grep -f dirtyWords.txt dvdimage.str > grepOutput.txt #grep -f dirtyWords.txt dvdimage.uni.str > grepOutput.uni.txt
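A slightly more forgiving version of the dirty-word search is sketched below. It assumes the strings file was produced with decimal offsets (`--radix=d` or `-t d`), so the first column of each hit is a byte offset into the image. `search_strings` and `hit_offsets` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch only: search_strings and hit_offsets are hypothetical helpers.

# Search a strings file for every entry in a word list.
search_strings() {
    local words="$1" strfile="$2"
    # -i ignores case, -F treats each word as literal text (no regex),
    # -f reads one pattern per line from the word list.
    # Caution: an empty line in the word list matches every line.
    grep -i -F -f "$words" "$strfile"
}

# Keep only the first column of each hit: the decimal byte offset
# that strings wrote in front of the matched text.
hit_offsets() {
    awk '{print $1}'
}
```

For example, `search_strings dirtyWords.txt dvdimage.str | hit_offsets` would list where in the image each hit sits, which helps map a hit back to a file.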
Search Options--grep • Results • Process sounds a little involved, however . . . • Took about 30 minutes to image DVDs and run commands. • Not Bad! • Identified Word-Processed files with inappropriate jokes • Doesn't get image files (didn't expect it to) • Doesn't Identify Creator of files • Zero non-repudiation • Doesn't help investigation confirm or deny ownership of files • Bonus: found survey data with Too Much Information • Protected student information in clear text
Search Options--ssdeep • Linux and Windows • http://ssdeep.sourceforge.net/ • Uses fuzzy hashing • A “partial” or “inexact” hashing of files to identify similar files • Its author, Jesse Kornblum, even uses the phrase “finding needles in haystacks” in his documentation! • Haven't heard of it being used to find questionable pictures, but why not give it a try?
Search Options--ssdeep • “ssdeep! Go find files in the test directory that look like files in the “homeStuff” directory!” #ssdeep -lrd test homeStuff • Bummer-- • Identified exact matches only
Search Options--ssdeep • Need to try carving out portion of file for true fuzziness • Skip the first 20 blocks (header info and more) of file and cut out the next 70 blocks for the hash comparison: #dd if=dsc00219.jpg of=dsc00219partial.jpg skip=20 count=70 • Create file for comparison #ssdeep dsc00219partial.jpg > testhash.txt • Compare fuzzy hash of image to images in directory #ssdeep -lrm testhash.txt homeStuff
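The carving step depends on dd's block size. With the default of 512 bytes, skip=20 drops the first 20 × 512 = 10240 bytes and count=70 copies the next 70 × 512 = 35840 bytes. A minimal sketch that makes the block size explicit (`carve_blocks` is a hypothetical helper name):

```shell
#!/usr/bin/env bash
# Sketch only: carve_blocks is a hypothetical helper.

# Copy <count> 512-byte blocks from <in> to <out>, starting after
# the first <skip> blocks. bs=512 is dd's default, spelled out here.
carve_blocks() {
    local in="$1" out="$2" skip="$3" count="$4"
    dd if="$in" of="$out" bs=512 skip="$skip" count="$count" 2>/dev/null
}
```

Called as `carve_blocks dsc00219.jpg partial.jpg 20 70`, this should produce a 35840-byte slice, provided the input file is at least 46080 bytes long; a shorter input yields a shorter slice.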
Search Options--ssdeep • Results: • Not Promising • Can check for similarities in files on a file-by-file basis, but that's too much like a manual search • Can easily find exact matches • so you must have the file you are looking for ??? • However . . . • Useful for an intellectual property issue or finding known bad files
Search Options--foremost • Linux and Windows • http://sourceforge.net/projects/foremost/ • Identifies files based on a database of file headers and footers • Find a list of most file headers at http://www.wotsit.org
Search Options--foremost This is the header of a gzip file displayed in a hex editor The gzip header is 0x1f 0x8b 0x08
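The same header check can be done without a hex editor. The sketch below reads the first three bytes of a file and compares them with the gzip signature quoted on the slide; `is_gzip` is a hypothetical helper name, and this is only an illustration of header matching, not how foremost is implemented.

```shell
#!/usr/bin/env bash
# Sketch only: is_gzip is a hypothetical helper.

# Succeed if the file starts with the gzip signature 0x1f 0x8b 0x08.
is_gzip() {
    local magic
    # head takes the first three bytes, od prints them as hex,
    # tr strips the spaces and newline so the result is e.g. "1f8b08".
    magic=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b08" ]
}
```

A loop such as `for f in *; do is_gzip "$f" && echo "$f"; done` would list files in the current directory that carry a gzip header regardless of their extension.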
Search Options--foremost #foremost -o pathToOutputDir -c pathToConfigFile pathToImageFile
Search Options--sorter • Linux and Windows • Perl wrapper for several Sleuthkit tools http://www.sleuthkit.org/ • Runs against a disk image • Finds active or deleted files • Then displays thumbnail view of the files
Search Options--sorter #sorter -s -d pathToOutputDir pathToInputFile
Search Options--sorter • Results • Saves many steps compared to foremost • Still have a bunch of thumbnails to look through
Search Options There’s gotta be an easier way!
Search Options--CBIR • Content Based Image Retrieval • Commercial Versions Available • My Office (me) too cheap: didn’t even look into commercial options! • Free and Open Source • imgSeek • Linux and Windows http://www.imgseek.net/ • GNU Image Finding Tool • Linux http://www.gnu.org/software/gift/gift.html
Search Options--CBIR • ImgSeek Demo
Lessons Learned • Mission Accomplished! • Not so much • Found inappropriate material among 6 GB of mixed images, word-processed, and text files • Failed to identify owner/creator of files • Identified a potentially useful tool
Lessons Learned • Need to develop incident response procedure for entire organization • Procedures for breaches of Personally Identifiable Information and Payment Card data are on the books • Procedure for responding to HR requests needs documentation • And needs distribution to de-centralized IT units
References: • The Sleuthkit (includes sorter) • http://www.sleuthkit.org/ • foremost • http://sourceforge.net/projects/foremost/ • ssdeep • http://ssdeep.sourceforge.net/ • imgSeek • http://www.imgseek.net/ • GIFT (GNU Image Finding Tool) • http://www.gnu.org/software/gift/gift.html • Presentation available at: • http://boisestate.edu/oit/iso/HTCIA&CBIR.ppt