Transforming Windows Event Log Analysis using AWS: A Scalability and Visualization Journey
This project explores ways to analyze and visualize Windows Event Log files using AWS. After running into challenges with data access and scalability, I developed a log file analyzer that uses AWS's parallel computing capabilities to handle 6 GB of diverse log data efficiently. The goal was not only to identify patterns in errors but also to create a visually engaging representation of the analysis. Along the way, I encountered numerous obstacles, learned valuable lessons about cloud computing, and saw the importance of effective data visualization.
Presentation Transcript
VisualizeME Kia Manoochehri
Motivation • Data Analysis • Learn about AWS • Struggle to do so • Data Visualization • An exercise in scalability
Motivation Do something different.
Introduction • Original intent: • Gather Windows Event Log files • Determine when an error would occur • Visualize this somehow
Introduction • Problems: • Gather Windows Event Log files • No access to them outside of my own • Determine when an error would occur • Easy if I had access to a monumental amount of data • Visualize this somehow • Complicated because *what* would I visualize?
Introduction • Actual design: • A Log File (data) Analyzer of a different system • Ran through AWS to exploit parallelization of the log files • Exercise in scalability • Visualize this data in a “meaningful way”
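The "exploit parallelization of the log files" idea can be sketched in a few lines. This is a minimal Python illustration, not the MATLAB code the project actually ran on EC2. The log format (level keywords such as ERROR appearing in a line), the function names `count_levels` and `analyze_logs`, and the use of a local thread pool are all assumptions made for the example. The point it shows is that each .txt file can be analyzed independently, so the work splits cleanly across workers and the per-file results are merged at the end.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_levels(path):
    """Count lines per level keyword in one .txt log file.

    The keywords are a hypothetical format; the real data layout
    is not described in the slides.
    """
    counts = Counter()
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            for level in ("ERROR", "WARNING", "INFO"):
                if level in line:
                    counts[level] += 1
                    break  # count each line at most once
    return counts


def analyze_logs(paths, max_workers=4):
    """Analyze every file independently, then merge the per-file counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs count_levels on each path across the worker threads
        for counts in pool.map(count_levels, paths):
            total.update(counts)
    return total
```

Calling `analyze_logs(list_of_txt_paths)` returns one merged `Counter`. Because no file depends on another, the same split would apply if the workers were separate cloud instances rather than local threads.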
Side note about the Data • Came from another project I previously worked on • .txt files of varying size • Total volume of data = 6 GB
Potential Results • Show that using AWS can relieve problems • More data? • Errors? • Runtime? • Have a cool visualization tool! • Data in an easy to read way.
Design Decisions • MATLAB • Many pros • Many many cons • AWS • We weren’t given any other option • Project itself
Problems Encountered • MATLAB • Able to use the “Parallel Computing Toolbox” • Costs $$$ - Need licenses for the cloud • Solution? • Be dirty…
Problems Encountered • AWS • Came into this class with 0 cloud experience and knowledge • Solution? • Spent more time outside of class learning and reading about the cloud and running an application on AWS than inside of class.
Problems Encountered • Project itself • Multiple aspects to this project • Analyzer (coding) • Running it on AWS (design, choices, and zero experience) • Visualizing the data (coding and design) • On the surface: • Not an impressive use of AWS • Good lesson on the scalability aspect of Cloud Services • Great lesson on trust/security of data…
Implementation MATLAB on EC2
Implementation • Development and Testing: • My home desktop and laptop • Git makes this easy • After Toby’s presentation on AWS early in the semester, I chose to isolate my development and testing environment (from the cloud).
Performance • Scalability: • Time and effort saving • Fundamentally, space saving • >6 GB (6144 MB) of .txt files compressed/converted to 229 MB (wow) • Visually pleasing?
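To put the space saving in perspective, the two figures on this slide can be turned into a ratio. The short Python sketch below uses only the slide's numbers (6144 MB in, 229 MB out); the helper name `compression_stats` is made up for the example. Since the slide says the input was more than 6 GB, both results are lower bounds.

```python
def compression_stats(original_mb, compressed_mb):
    """Return (size ratio, fraction of space saved)."""
    ratio = original_mb / compressed_mb
    saved = 1 - compressed_mb / original_mb
    return ratio, saved


ratio, saved = compression_stats(6144, 229)
print(f"ratio: {ratio:.1f}x, space saved: {saved:.1%}")
# prints "ratio: 26.8x, space saved: 96.3%"
```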
Lessons Learned • AWS • Application of some of the topics we discussed • Firsthand account of security issues and reluctance to use the Cloud • Don’t overcommit… • I was already addicted to caffeine.