HACQIT: Hierarchical Adaptive Control of QoS for Intrusion Tolerance • James E. Just, James C. Reynolds, Karl Levitt • 13 February 2001 • The UC Davis Computer Security Laboratory
Outline • Team • HACQIT idea • Goals • Architecture • Status • Plans • Current Capabilities • Questions and Issues
HACQIT Team • Teknowledge Corporation – architecture, design, Quorum component modification, monitor/adapter development, integration • J. Just • J. Reynolds • L. Clough • R. Maglich, E. Lawson • UC Davis – attack modeling, sensing and response options • K. Levitt • R. Pandey, F. Wu • J. Rowe • M. Tylutki
The HACQIT Idea • Utilize robust hierarchical control of QoS and other fault tolerance techniques to deliver critical COTS services to critical users while under attack • Significantly raise adversary work factors • Focus on useful military applications • Policy driven • Leverage current and new technologies • QoS/Quorum – DeSiDeRaTa, AQuA, QCS, others • IA&S – wrappers, intrusion and integrity sensors, active monitoring & response, randomization, VPNs, attack modeling (Jigsaw concepts), honeypots • Fault tolerance – separation, diversity, replication & check-pointing, fail-over • Others – out-of-band signaling, etc • Incrementally deliver capabilities
Project Goals • Prototype HACQIT controlled cluster delivering • 4 hours of intrusion tolerance under • Active Red Team attacks on hosts • Providing policy determined critical services from • COTS/GOTS applications to • Critical users (also policy determined) at • 75% capacity • Extensible model base for Intrusion Tolerance • Focus • COTS HW & SW based for near term utility • Architecture based framework for longer term extensibility (hierarchical and fractal)
HACQIT Scope • Will address • QoS control for critical services to critical users • Hierarchical, extensible object control model • Attacks on host availability and integrity • Variety of COTS/GOTS applications • Policy specification of above • Won’t address • Network infrastructure (e.g., denial of service attacks on bandwidth, routers or LANs) or physical attacks • Integrity of data sources used as inputs, confidentiality • Legitimate but false insider manipulation of application • Developing new sensors or mechanisms, but will leverage them
HACQIT Reference Architecture • [Diagram labels: HACQIT Protected Enclave, HACQIT Protected Node, Other Enclaves, LAN, WAN, FW, GW, Switch, Monitor & Adapter, Server p, Server q, Server r, Users 1 through N, User M, User P, Noise Generator] • Key: Controller; Sensors; Attacker; Server = Critical Service or Non-Critical Service; User = Critical User or Non-Critical User • Out-of-Band Comms Between HACQIT M/A’s & Cyber Panel??
HACQIT Node Architecture • [Diagram labels: FW, GW, VPN, Switch, Server (x6), Primary Nodes, Backup Nodes, Monitor & Adapter, Sensors, Controls, Decoys/Fishbowls, Out-of-Band Control Pathways, To Critical Users, Communications with other Controllers] • Monitor-adapter uses Out-of-Band signaling for complete separation from network attacks on LAN and WAN
HACQIT Monitor-Adapter Software Overview • [Diagram labels: HACQIT Visualization; HACQIT Controllers; IT Event Log, Other Clients; HACQIT Monitor/Adapter; To/From Other M/As; Policies and Specs.; Mediator (x3); Integrity & State Sensors; Intrusion Sensors; Performance Sensors]
Goals of Control (Increasing Difficulty) • Continue critical service • Migrate critical applications • Administer system (e.g., add or remove critical user or critical service) • Gather more information (e.g., refocus sensors, turn on more intrusive sensing, use decoys or fishbowls) • Stop current attack • Note: Control over enclave firewall and critical user protection features needed • Stop future attacks
Paradigm for Responding to Integrity Intrusion • REPEAT UNTIL ATTACK SYMPTOMS DISAPPEAR • Detect integrity violation on a critical file • Switchover to backup server; restore prior version of critical file on primary • Use Jigsaw model to determine possible causes and sources of attack • Deploy sensors and responders as determined by model • If attack persists block with responders
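The loop on this slide can be sketched as a small simulation. This is only an illustration under stated assumptions: `SimulatedPrimary`, `respond`, the SHA-256 baseline check, and the rule "block on the second round" are hypothetical stand-ins, and the Jigsaw model consultation is a logged placeholder, not the project's actual model.

```python
import hashlib


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class SimulatedPrimary:
    """Toy stand-in (hypothetical) for a primary server holding one critical file."""

    def __init__(self, good: bytes, attacks_left: int):
        self.good = good
        self.content = good
        self.attacks_left = attacks_left  # times the attacker can still tamper
        self.blocked = False

    def attacker_step(self) -> None:
        # The attacker re-tampers with the file unless a responder blocks it.
        if not self.blocked and self.attacks_left > 0:
            self.content = b"tampered"
            self.attacks_left -= 1

    def restore(self) -> None:
        self.content = self.good


def respond(primary: SimulatedPrimary, baseline: str, max_rounds: int = 10) -> list:
    """REPEAT UNTIL ATTACK SYMPTOMS DISAPPEAR (bounded by max_rounds)."""
    log = []
    for round_no in range(1, max_rounds + 1):
        primary.attacker_step()
        if sha256(primary.content) == baseline:
            log.append("symptoms gone")
            return log
        log.append("violation detected")
        log.append("switchover to backup")
        primary.restore()
        log.append("restored critical file on primary")
        log.append("consult attack model (placeholder)")
        log.append("deploy sensors and responders")
        if round_no >= 2:  # assumption: the attack counts as persistent from round 2
            primary.blocked = True
            log.append("block with responders")
    return log


primary = SimulatedPrimary(b"trusted-host\n", attacks_left=5)
for entry in respond(primary, sha256(b"trusted-host\n")):
    print(entry)
```

The bounded `for` loop is a design choice for the sketch: a real controller would presumably also need an exit into a degraded mode if symptoms never disappear.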
Paradigm for Responding to Internal DOS Attack • REPEAT UNTIL ATTACK SYMPTOMS DISAPPEAR • Detect denial of service violation on primary server • Switchover to newly created process on server; kill process causing denial of service • Use Jigsaw model to determine possible causes and sources of attack • Deploy sensors and responders on server and on firewall as determined by model • If attack persists block with responders
HACQIT Actions in Responding to Connection Spoofing Attack • Detect change to .rhosts file on primary • Switchover to backup • Restore previous version of primary, which is now the backup • Use Jigsaw model of attacks to identify possible causes of integrity problem • Change is legitimate by “clean” process-- no integrity problem • Change is by an unauthorized process • Change is by a legitimate rcommand • Change is by an unauthorized rcommand
HACQIT response to Connection Spoofing (cont) • HACQIT checks for erroneous processes -- finds none; so conclude change is legitimate or due to an rcommand • HACQIT starts monitoring for rcommands • Attack persists, but now on backup with arrival of rcommand • HACQIT temporarily blocks rcommand until verification • HACQIT monitoring detects symptoms of connection spoofing attack -- sequence number guessing, DOS on a host • If traceback to true source s is possible, connections from s are blocked; otherwise, degraded mode (no rcommands)
Example w/ Capabilities • Example attack composed of multiple concepts and capabilities • [Diagram labels: ExecuteCommands, Remote Execution, RSH Connection Spoof, Prevent Connection Response, Forged Src Address, Seq. Number Guess, Spoofed Connection, Spoofed Packet, Connection Spoof, RSHActive, cat + + >> /.rhosts, RemoteLogin, Packet Spoofing, Synflood, Seq # Probe, Address Forging]
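One way to read "an attack composed of multiple concepts and capabilities" is as forward chaining: each concept requires some capabilities and provides others. The sketch below reuses the label names from the diagram, but the requires/provides edges in `CONCEPTS` and the `derive` function are assumptions made for illustration; the slide does not preserve the actual arrows, and this is not the Jigsaw implementation.

```python
# Hypothetical requires/provides table; edges are guessed from the slide labels.
CONCEPTS = {
    "Synflood": {"requires": set(), "provides": {"Prevent Connection Response"}},
    "Seq # Probe": {"requires": set(), "provides": {"Seq. Number Guess"}},
    "Address Forging": {"requires": set(), "provides": {"Forged Src Address"}},
    "Packet Spoofing": {
        "requires": {"Forged Src Address"},
        "provides": {"Spoofed Packet"},
    },
    "Connection Spoof": {
        "requires": {"Spoofed Packet", "Seq. Number Guess",
                     "Prevent Connection Response"},
        "provides": {"Spoofed Connection"},
    },
    "RSH Connection Spoof": {
        "requires": {"Spoofed Connection", "RSHActive"},
        "provides": {"Remote Execution"},
    },
}


def derive(concepts, initial):
    """Fire every concept whose requirements are met until nothing changes."""
    caps = set(initial)
    fired = []
    changed = True
    while changed:
        changed = False
        for name, concept in concepts.items():
            if name not in fired and concept["requires"] <= caps:
                caps |= concept["provides"]
                fired.append(name)
                changed = True
    return caps, fired


caps, fired = derive(CONCEPTS, {"RSHActive"})
print("Remote Execution" in caps, fired)
```

Under this reading, a responder works by removing one required capability: dropping the `Synflood` entry from the table, for example, leaves `Spoofed Connection` unreachable.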
Coverage: Illustrative Application “Types” What follows is a first cut. It is not yet clear what minimum set of characterization “axes” gives HACQIT the necessary robustness. • Human user, client side applications, central storage, e.g., MS Office • Human user, client-server, e.g., web servers or applications • Human user, client-server-database (three tier), e.g., shared planning applications, web based applications • Human user, store and forward applications, e.g., email – sendmail or Exchange • Human user, communication and collaboration applications, e.g., CVW or Odyssey or NetMeeting • Real time, server to server or server to server to … to server, e.g., radar processing or weapon system controls • Others as necessary
First Round Capabilities Demonstrated • Manual migration • Simulation of attacks resulting in: • Soft reboots • Hard reboots • Simulation of integrity attack (Tripwire) • Simulation of performance degrading attack (cpuhog) • Detect “runaway” host process • Detect QoS degradation • Second round capabilities described in Part 2
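Detecting a "runaway" host process can be sketched as a threshold test over recent CPU samples. The function name `find_runaways`, the 90% threshold, the three-sample window, and the authorized-process set are all assumptions for illustration; the slides do not state the actual sensor logic.

```python
def find_runaways(samples, authorized, threshold=90.0, window=3):
    """Return PIDs whose last `window` CPU readings all exceed `threshold`.

    samples: dict mapping pid -> list of CPU utilization percentages (oldest first)
    authorized: set of pids allowed to run hot (hypothetical policy input)
    """
    runaways = []
    for pid, readings in samples.items():
        recent = readings[-window:]
        sustained = len(recent) == window and all(r > threshold for r in recent)
        if sustained and pid not in authorized:
            runaways.append(pid)
    return sorted(runaways)


samples = {
    101: [10.0, 95.0, 96.0, 97.0],  # sustained high load, not authorized
    202: [99.0, 99.0, 99.0],        # sustained high load, but authorized
    303: [95.0, 50.0, 99.0],        # spiky, not sustained
}
print(find_runaways(samples, authorized={202}))
```

Requiring several consecutive readings rather than a single spike is a simple way to avoid triggering failover on short bursts of legitimate work.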
What We’ve Done & Status • Second round of “attack & solution space” exploration • Test applications include Dynbench, Apache web server, Notepad – (Exchange or sendmail in works) • Refining architecture & design -- HACQIT requirements and component responsibilities being refined • DeSiDeRaTa code conquered (or at least subdued) • Cross project coordination underway • Attack and response modeling begun • New code developed for: • Secure Task Manager and heartbeat monitor • Sensor manager, e.g., Tripwire and wrappers • Response managers: e.g., firewall and auditing • Policy driven wrappers • HACQIT monitor/adapter
Plans • Continue coordination and leveraging activities • Development • Continue design, rapid experimentation (risk reduction), and research efforts through February • Write specification for first real prototype in March • Develop solid prototype for June evaluation • Evaluation – Red Team, informal hacker exposure via internet, other • Go into next cycle • Leverage • SCC – firewall on NIC (ADF) • Draper – Gateway “cleaner” • Other activities • IFIP WG 10 Dependability Benchmarking
Current Demonstration Purpose • Use of policy driven wrapper technology to intercept suspicious calls and initiate failover • Use of Quality of Service manager to effect switchover • Use of diversity among primary and backup to reduce likelihood of renewed attack against backup • Lab capability to test protective measures against actual attacks and mitigate their effects on simulated users
HACQIT Demonstration Configuration • HACQIT primary is NT running Apache and/or Exchange and/or MS Word • HACQIT backup is Linux • Outside the HACQIT cluster firewall are three client workstations: Good and Continuing, two “weak” legitimate user clients, and Bad, the source of malicious attacks • Bad could be inside or outside the Enclave that is protected by the second firewall • Bad is inside the enclave firewall for convenience • Eventually a user will be outside the enclave firewall • Secure channel to the client • Currently VPN is not used • Eventually will use VPNs or IPSec
Current Demonstration Scenario • Users (Good and Continuing) running NT in enclave • Processes on Good and Continuing simulate client demands on Apache • Other processes will simulate demands on Exchange or saving Word files on primary • Automated attack launched from Bad to take over Good and then attack web server on Primary (NT) • Exploit vulnerabilities in Apache or Exchange or Word • Attack executes a program and/or modifies file • Primary attack detected & mitigated by Apache wrapper • Wrapper communicates with the Monitor/Adapter on our out-of-band machine • Monitor/Adapter starts Apache (under Linux) on the backup and tells firewall controller to switch IP address from the primary to the backup • Exchange or Word could start using Wine or VMWare technology
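The failover path in this scenario (wrapper alert, Monitor/Adapter starts the service on the backup, firewall controller redirects the service address) can be sketched as follows. The classes `Node`, `FirewallController`, and `MonitorAdapter` and their methods are hypothetical names for illustration only, not the project's interfaces.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    name: str
    os: str
    services: Set[str] = field(default_factory=set)

    def start(self, service: str) -> None:
        self.services.add(service)


@dataclass
class FirewallController:
    active_node: str  # node currently receiving traffic for the service address

    def switch_to(self, node_name: str) -> None:
        self.active_node = node_name


@dataclass
class MonitorAdapter:
    primary: Node
    backup: Node
    firewall: FirewallController
    events: List[str] = field(default_factory=list)

    def on_wrapper_alert(self, service: str, reason: str) -> None:
        # Step 1: record the alert received over the out-of-band channel.
        self.events.append(f"alert from wrapper on {self.primary.name}: {reason}")
        # Step 2: start the critical service on the (diverse) backup.
        self.backup.start(service)
        self.events.append(f"started {service} on {self.backup.name} ({self.backup.os})")
        # Step 3: tell the firewall controller to redirect the service address.
        self.firewall.switch_to(self.backup.name)
        self.events.append(f"service address now points at {self.backup.name}")


primary = Node("primary", "NT", {"apache"})
backup = Node("backup", "Linux")
firewall = FirewallController(active_node="primary")
ma = MonitorAdapter(primary, backup, firewall)
ma.on_wrapper_alert("apache", "suspicious call intercepted")
print(firewall.active_node, sorted(backup.services))
```

Starting the service on the backup before switching the address is a deliberate ordering in this sketch, so that redirected clients do not reach a node with nothing listening.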
New Issues • Can we capture or redirect client requests so that users really are not interrupted during a migration? • What does this mean for real time? How does Desi do failover without the DynBench applications losing something? Do they stop processing until the connection is reestablished? • How can we save state and migrate an application? Do different types or classes of applications have different requirements? What is HACQIT’s ability to cover these different classes? • Can we add a new user? We might want to disable all sessions and start over with trusted connections
Incremental Implementation Approach (I) • Several capability levels are envisioned • Lower levels are specified • Level 0 – Insider attack and simple migration • No firewall on HACQIT cluster • Only critical application is Apache web server (no Microsoft Exchange) • Simulations of user web server activities would be running on both Good and Continuing • Attack from weak client Good against Apache web server on primary • Sense compromise via wrapper integrity checker on Apache which then communicates with Monitor/Adapter (M/A) • M/A migrates Apache web server from NT primary to NT backup
Incremental Implementation Approach (II) • Level 1 – Outsider attack with simple, cross platform migration and increased sensing • Firewall(s) added to HACQIT cluster • Simulations of user web server activities added to a legitimate user machine outside the enclave • Bad attacks weak client, Good, to compromise it and set up attack from Good against Apache web server on NT primary • Same sensing and communication by wrapper as above • M/A migrates web server to Linux • M/A turns on increased auditing on firewall/gateway as another response
Incremental Implementation Approach (III) • Level 2 – Uninterrupted Critical User during Above Attack and Migration • Demonstrate uninterrupted user via a change to the ARP table • Note: we may still lose the response to the last user request for web services • Level 3 – Block the Attack • Identify the IP address of the attacker and block attacker at firewall or router, e.g., add blocking command via OPSEC interface or change a rule to shut out attacker’s access to primary • At some point we’d like to know if the attack was from a compromised weak client or an insider – probably not at this capability level
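The Level 3 response, blocking an identified attacker address, can be sketched with a small in-memory rule table. `RuleTable` is a hypothetical stand-in and does not model the OPSEC interface or any real firewall's rule syntax; the addresses are from documentation ranges.

```python
import ipaddress


class RuleTable:
    """Toy deny list standing in for a firewall policy (hypothetical)."""

    def __init__(self):
        self.denied = []  # list of ipaddress network objects

    def block(self, source: str) -> None:
        # Accept a single address or a CIDR prefix; a bare address becomes a /32.
        net = ipaddress.ip_network(source, strict=False)
        if net not in self.denied:
            self.denied.append(net)

    def allows(self, source: str) -> bool:
        addr = ipaddress.ip_address(source)
        return not any(addr in net for net in self.denied)


table = RuleTable()
attacker = "198.51.100.9"      # address identified as the attack source
critical_user = "192.0.2.10"   # a critical user who must keep access

table.block(attacker)
print(table.allows(attacker), table.allows(critical_user))
```

If the attack comes through a compromised weak client, blocking by address also shuts out that client's legitimate user, which is one reason the slide's closing question about distinguishing the two cases matters.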
Incremental Implementation Approach (IV) • Level 4 – Multiple Critical Applications • Add mail server (Microsoft Exchange or some mail server that is cross platform) • Level 5 – Critical Application with State • Save state of critical application and migrate • Level 6 – Same Machine Failover • Use wrappers to ensure non-compromise of OS and failover critical application to the same machine (i.e., start up clean application on same Primary and kill attacked process) • Level 7 – Remediation of Compromised Primary • Level 8 – Other Types of Diversity & Use of Decoys • Level 9 – Randomization of Responses • Note: Levels 4-9 are relatively independent and can be done in parallel or in a different order
Architectural Explorations • Major focus of HACQIT is to develop Intrusion Tolerant – oops! – I mean Organically Assured and Survivable architecture • Our levels of capability demonstrations suggest the Subsumption architecture (Brooks, 86) • Brooks developed the architecture for his famous robot projects but many of our requirements are the same • Need certain amount of “stupid” reactive behavior • Need guaranteed fast response • Brooks implemented each layer in his architecture as a deterministic finite state machine with simple I/O • No world model is depended on • Communication from higher to lower levels is done through suppression and injection
HACQIT Mapping to Subsumption Architecture from Levels 0-2 Capabilities • Pre-Level 0 capability features could be lowest layer like Brooks’ “Avoid” module • Unauthorized process on primary boosts CPU utilization above threshold: kill process, move critical service to backup • Unauthorized modification of file: move critical service to backup • Level 0 capabilities would be second layer • Wrapper intercepts suspicious call: move critical service to backup • Diversity advantage: Backup runs different OS than primary • TCPDump is turned on after suspicious call is intercepted (heightened awareness) • Level 1 would be third layer • Migration is effected without interrupting critical users (change ARP table) • Level 2 would be fourth layer • Source address of attack is identified • Address blocked by change to firewall policy
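The layer mapping on this slide can be sketched as stacked reactive rules, where each higher layer may suppress a lower layer's output and inject its own. This is a loose illustration of the idea, not Brooks' formalism or the HACQIT design: the function names, the event dictionary fields, and the exact suppression rule are assumptions, and the layer numbering follows this slide.

```python
MOVE = "move critical service to backup"


def avoid_layer(event):
    """Layer 1 (pre-Level 0): purely reactive rules, no world model."""
    if event["type"] == "cpu_over_threshold":
        return ["kill process", MOVE]
    if event["type"] == "file_modified":
        return [MOVE]
    return []


def wrapper_layer(event):
    """Layer 2 (Level 0): react to a suspicious call intercepted by a wrapper."""
    if event["type"] == "suspicious_call":
        return [MOVE, "turn on TCPDump"]
    return []


def seamless_layer(actions):
    """Layer 3 (Level 1): suppress the plain move and inject an ARP-based one."""
    return ["migrate via ARP table change" if a == MOVE else a for a in actions]


def blocking_layer(event, actions):
    """Layer 4 (Level 2): if a source address is known, add a firewall block."""
    source = event.get("source")
    if actions and source:
        return actions + [f"block {source} at firewall"]
    return actions


def control(event, top_layer):
    """Run the stack up to `top_layer`; lower layers work without higher ones."""
    actions = avoid_layer(event)
    if top_layer >= 2:
        actions = actions + wrapper_layer(event)
    if top_layer >= 3:
        actions = seamless_layer(actions)
    if top_layer >= 4:
        actions = blocking_layer(event, actions)
    return actions


print(control({"type": "suspicious_call", "source": "198.51.100.9"}, top_layer=4))
```

The property the sketch tries to preserve is that removing the upper layers still leaves a working, if cruder, response: with `top_layer=1` a file modification still moves the critical service to the backup.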
Higher Levels of Capability • Multiple critical applications • Failover which saves state • Failover on the same machine • Forensics • All may be too complex, long-lived, or require global information in order to implement • We’re looking at DICAM as architecture for these functions
Technology Transfer Exposure Opportunity • NSWC (Mike Masters) is technology transfer target for Quorum • Annual demonstration in September • Security and intrusion tolerance are of interest • Willing to discuss inclusion of HACQIT in demonstration – leverages Quorum technologies • Need to start coordination planning in April
Issues • Looking for interested potential users for feedback (PACOM, NSWC, other?) • Need help in getting ACOA server software • Reuse of research prototypes
Backup 1: Towards a Formal Methodology for Responding to Integrity and DOS Intrusions • Jim Just -- Teknowledge • Karl Levitt, Jeff Rowe, Marcus Tylutki, Nicole Carlson, Steven Templeton, Mark Heckman -- UCD
Connection Spoofing Attack • Multiple stage • Attacker establishes a TCP connection to a host (server) H exploiting a trust relationship (through .rhosts) between H and some other host H1 • Attack involves • Denial of service on H1 • Sequence number guessing • Planting a trojan horse on H • Many variants are possible • Detection is assumed to occur when .rhosts file on H is erroneously modified
Scenario Attacks: an example • Hosts: sarte, spock, kafka • RSH trust relation: sarte trusts kafka, will execute programs for kafka • (1) Spock launches synflood attack against kafka • (2) Spock probes sarte for starting sequence number on RSH port • (3) Spock sends syn packet to TCP/RSH on sarte w/ source forged to be kafka • (4) Sarte sends syn/ack to kafka • (5) Kafka drops packet due to DoS