Understanding Computer Viruses: Definition, Mechanisms, and Classification
This chapter delves into the world of computer viruses, defining their characteristics, such as self-replication and parasitism. It details the components of a virus, including infection mechanisms and payloads, highlighting the distinction between viruses and logic bombs. The chapter categorizes viruses based on their targets and concealment strategies, explaining how they can infect boot sectors, executable files, and data files. Additionally, it explores different infection methods, such as prepending, appending, and overwriting within infected files.
Presentation Transcript
Virus Definition • Recall definition from Chapter 2… • Self-replicating: yes • Population growth: positive • Parasitic: yes • When executed, tries to replicate itself into other executable code • So, it relies in some way on other code • Does not propagate via a network
Virus • 3 parts to a virus • Infection mechanism --- how it spreads • Multipartite virus uses multiple means • Trigger --- decides when/how to deliver payload • Payload --- what it does other than spread • Either intentional or accidental
Virus Pseudocode • Without infection mechanism… • It’s not a virus, it’s a logic bomb • But trigger and payload are optional • Generic virus pseudocode
def virus():
    infect()
    if trigger() is true:
        payload()
Infection Pseudocode • Targets must be “local” • Don’t select already infected targets • Can be a double-edged sword
def infect():
    repeat k times:
        target = select_target()
        if no target:
            return
        infect_code(target)
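The infection loop above can be modelled harmlessly in memory. The sketch below is only a toy simulation: the "files" are dictionary entries holding strings, and `MARKER`, `select_target`, and the file names are hypothetical stand-ins, not taken from any real virus. It shows why skipping already infected targets requires some detectable marker, which is the double-edged sword: the same marker can be used by a scanner, or faked to keep a file from being selected.

```python
import random

MARKER = "#infected"  # hypothetical marker, used only in this toy model


def select_target(files, rng):
    """Pick a 'local' entry that does not yet carry the marker."""
    clean = [name for name, body in files.items() if MARKER not in body]
    return rng.choice(clean) if clean else None


def infect(files, k, rng):
    """Mirror the slide's loop: try up to k times, stop when nothing is left."""
    for _ in range(k):
        target = select_target(files, rng)
        if target is None:
            return
        files[target] += MARKER  # stand-in for infect_code(target)


files = {"a.exe": "code-a", "b.exe": "code-b", "c.exe": "code-c"}
infect(files, k=2, rng=random.Random(0))
print(sum(MARKER in body for body in files.values()))  # 2

# A file that already carries the marker is never selected again.
files["d.exe"] = "code-d" + MARKER
infect(files, k=10, rng=random.Random(1))
print(files["d.exe"].count(MARKER))  # 1
```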
Virus Classification • Possible to classify in many ways • Here, we classify in 2 ways: • Target • What/where does the virus infect? • Concealment strategy • What does it do to remain undetected?
Classification by Target • Briefly consider 3 cases • Boot-sector infectors • Executable file infectors • Data file infectors (macro viruses)
Boot Sequence • Generic boot sequence • Power on • ROM-based instructions run • Self-test, device detection, initialization • Boot device IDed, boot block read from it • Control transferred to the loaded code --- this step known as primary boot
Boot Sequence Continued • Code loaded in primary boot step loads larger, fancier program • This is secondary boot • Secondary boot loads/runs OS kernel
Boot Sector Infector • Why infect boot sector? • A boot-sector infector (BSI) • Infects by copying itself to boot block • May copy boot block elsewhere • Could be tricky, require lots of code • So a fixed “safe” location chosen • Different viruses may use same “safe” location (e.g., Stoned and Michelangelo)
Boot Sector Infector • BSI once popular, not so much now • Why? • Machines don’t reboot so often • Much harder to infect, due to better defenses
File Infectors • OS views some files as executable • Like “exe” and similar • Files that can be run by a command-line “shell” also considered executable • Batch files, shell scripts, … • File infector --- infects executable file • Both exe files and shell scripts count as executable • Binary executable is most common target
File Infectors • Two main issues… • Where to put the virus within file? • How to execute the virus when infected file is run? • Consider these two (interrelated) questions in next few slides
Beginning of File • Older exe formats (e.g., .COM) treat entire file as chunk of code and data • Entire file loaded into memory • Execution starts by jumping to the beginning of the loaded file • Can put virus at start of such a file • That is, prepend the virus code
End of File • Append a virus (even easier?) • Then how does virus get executed? • Some possibilities… • Replace first line(s) with a jump to viral code --- save overwritten code • Later, transfer control back to code • How to do this?
End of File • How to transfer control back to code? • Run saved instructions in saved location • Restore the infected code back to its original state and run it • Many exe file formats specify start location in file header • If so, virus can change start location to point to its own code and jump to the original start location when done
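The header-based approach can be illustrated with a toy model. In the sketch below an "executable" is just a dictionary with an `entry` index and a list of symbolic instructions; this layout, the `run` interpreter, and the `append_and_redirect` helper are all assumptions made for illustration and do not correspond to any real file format. It shows appended code running first and control then returning to the original start location, with the original code left untouched.

```python
def run(image):
    """Execute a toy image: start at 'entry', follow jumps, stop at 'halt'."""
    trace, pc = [], image["entry"]
    while True:
        op = image["code"][pc]
        if op == "halt":
            return trace
        if isinstance(op, tuple) and op[0] == "jmp":
            pc = op[1]
            continue
        trace.append(op)
        pc += 1


def append_and_redirect(image, extra):
    """Append 'extra', point the entry at it, then jump to the old entry."""
    old_entry = image["entry"]
    new_entry = len(image["code"])
    code = image["code"] + list(extra) + [("jmp", old_entry)]
    return {"entry": new_entry, "code": code}


original = {"entry": 0, "code": ["host-1", "host-2", "halt"]}
modified = append_and_redirect(original, ["appended-1"])

print(run(original))  # ['host-1', 'host-2']
print(run(modified))  # ['appended-1', 'host-1', 'host-2']
```

In this model only the header field and the file length change, which is also what a defender can check for.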
Overwritten into File • Virus places itself atop original code • Can avoid changes in file size • Easy for virus to get control • But… overwriting code will break the original code • Making virus easier to discover • Is it possible to overwrite without breaking the code?
Overwritten into File • Smart ways to overwrite? • Overwrite repeated data • May be trickier to execute virus • Save overwritten data (like BSI) • Use over-allocated space in a file • Compress code to make space • For these to work, virus must be small
Merged with File • Could try to merge virus with target • I.e., intermixing virus/target code • Difficult • So, it’s “rarely seen” • But, supposedly, Zmist does this • So, apparently it is possible • That’s impressive…
Not in File • Companion virus --- separate from, but naturally executed before target • No modification to infected code • May take advantage of process used by OS or shell to search for exe files • Like a Trojan horse but it’s a virus… • …since it’s self-replicating
Companion Virus • Virus is earlier in the search path • Same name as the target file, almost… • E.g., MS-DOS searches for “foo” by • Look for foo.com • Look for foo.exe • Look for foo.bat • If the target file is a foo.exe, companion virus is in file foo.com
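The search order on this slide can be written as a short lookup. The sketch below is a simplified model with a directory represented as a set of file names; `resolve` and `shadowed` are hypothetical helper names. It shows why a `foo.com` placed next to `foo.exe` wins, and how the same rule can be used to flag suspicious shadowing.

```python
SEARCH_ORDER = (".com", ".exe", ".bat")  # order given on the slide


def resolve(command, directory):
    """Return the first file the search order would run for 'command'."""
    for ext in SEARCH_ORDER:
        candidate = command + ext
        if candidate in directory:
            return candidate
    return None


def shadowed(directory):
    """List (winner, hidden) pairs where an earlier extension hides a file."""
    hits = []
    for name in sorted(directory):
        base, dot, ext = name.rpartition(".")
        if dot and "." + ext in SEARCH_ORDER:
            winner = resolve(base, directory)
            if winner != name:
                hits.append((winner, name))
    return hits


print(resolve("foo", {"foo.exe"}))             # foo.exe
print(resolve("foo", {"foo.exe", "foo.com"}))  # foo.com
print(shadowed({"foo.exe", "foo.com"}))        # [('foo.com', 'foo.exe')]
```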
Companion Virus • Windows registry associates file types with applications • Can modify registry so that companion virus runs instead of exe • Then companion can transfer control to the corresponding exe • In effect, all exes infected at once!
Companion Virus • ELF file format used on recent Unix systems • Has "interpreter" specified in each exe file header • Points to run-time linker • Companion virus can replace the run-time linker • As above, effect is that all exe files infected at once
Companion Virus • Companion viruses possible in GUI • App’s icon can be overwritten with the icon for the companion virus • When a user clicks on “app” icon… • Companion virus runs instead
Macro Virus • Some apps allow data files to have macros embedded in them • Macros are short snippets of “code” interpreted by the application • Such languages often provide enough functionality to write a virus
Macro Virus • Macros often run automatically when file is loaded • Easy to write compared to low-level code • First proof of concept in 1989 • Hit “mainstream” in 1995 • Virus known as Concept • Targeted Microsoft Word (of course) • Installed in “global macros” • Infected all edited documents
Macro Virus: Concept • Targeted Word Docs • AutoOpen macro --- runs automatically when file opened • How you get the virus from infected file • FileSaveAs --- when “file save as” selected from menu • So the virus can infect other docs
Classification by Concealment Strategy • Most viruses try to hide • Why? • So, how do they hide? • Encryption • Polymorphism • Etc., etc. • Yet another way to classify viruses…
No Concealment • Do nothing to hide • This is easiest for virus writer… • …but also easiest to detect, analyze
Encryption • Why encrypt? • Virus body is “hidden” from view • In particular, the signature is hidden • Distinguish between strong encryption and obfuscation • Viruses usually only obfuscated • Very weak encryption
Encryption • How to encrypt? • Let me count the ways… • Simple encryption • Rotate, increment, negate, etc. • Static encryption key • E.g., XOR fixed byte to all bytes • Variable encryption key • Like static, but key changes
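The static-key case is small enough to show in full. The sketch below assumes a single-byte XOR key applied to a placeholder byte string (`body` is not a real signature, and the key values are arbitrary). It shows that the pattern disappears from the encoded bytes, that a different key gives different bytes, and why this counts as obfuscation rather than strong encryption: all 256 keys can be tried in an instant.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with one fixed key byte; applying it twice undoes it."""
    return bytes(b ^ key for b in data)


body = b"EXAMPLE-SIGNATURE"     # placeholder pattern, not a real signature
hidden = xor_bytes(body, 0x5A)  # static key
other = xor_bytes(body, 0xC3)   # variable key: same body, different bytes

print(b"SIGNATURE" in hidden)           # False
print(xor_bytes(hidden, 0x5A) == body)  # True
print(hidden != other)                  # True

# Weakness: only 256 keys exist, so an exhaustive search recovers the key.
candidates = [k for k in range(256) if b"SIGNATURE" in xor_bytes(hidden, k)]
print(candidates)                       # [90]
```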
Encryption (Continued) • Substitution cipher • Permute the bytes • Could be via lookup table • Could even have multiple ciphertexts decrypt to same plaintext • Strong encryption • DES, AES, RC4, etc. • Might use crypto libraries
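A lookup-table substitution is a permutation of the 256 byte values together with its inverse. The sketch below is a minimal illustration using Python's built-in `bytes.translate`; the `make_tables` helper and the seed are assumptions chosen for the example, and the simple one-to-one table shown here does not cover the variant where several ciphertexts map to one plaintext.

```python
import random


def make_tables(seed):
    """Build a byte permutation and its inverse as 256-byte lookup tables."""
    rng = random.Random(seed)
    enc = list(range(256))
    rng.shuffle(enc)
    dec = [0] * 256
    for plain, cipher in enumerate(enc):
        dec[cipher] = plain
    return bytes(enc), bytes(dec)


enc_table, dec_table = make_tables(seed=1)
message = b"substitution example"
ciphertext = message.translate(enc_table)

print(ciphertext.translate(dec_table) == message)  # True
print(len(set(enc_table)))                         # 256
```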
Stealth • Tries to hide the infection • Not just hide the virus signature • Examples of stealth techniques • Change timestamp and/or other file info to pre-infection values • Intercept I/O calls to hide presence (in MS-DOS user-accessible interrupts) • Hijack secondary boot loader
Stealth • Stealth viruses “overlap” rootkits • Rootkit --- installed on compromised machine so attacker can use it • Stealth is critical to rootkit success • Some malware use rootkits • For example, Ryknos Trojan hid itself using a rootkit designed for DRM
Reverse Stealth Virus • What is “reverse stealth”? • Make everything look infected! • Why is this malicious? • Damage may be done by AV software trying to disinfect
Oligomorphism • Oligomorphic or semi-polymorphic • Code is encrypted • Decryptor code is morphed • But not too many different decryptors • For example • Whale had 30 different decryptors • Memorial had 96 decryptors • How to detect?
Polymorphism • Like oligomorphic, but lots more decryptors • Essentially, an infinite number • For example • Tremor has almost 6 billion decryptors • So, AV software cannot have a signature for each decryptor
Polymorphism • 2 problems for polymorphic writer… • How to generate decryptors? • Use a mutation engine • Engine is part of encrypted virus • How to detect previous infections? • Data “hiding”: timestamp, file size, file system features, external storage, … • “Inoculate” system by faking infection?
Mutation Engine • Equivalent instruction substitution • One or more instructions • Instruction reordering • Register swap • Reorder data • Spaghetti code • Insert junk code • Run-time code modification/generation
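Two of the listed transformations, junk insertion and instruction reordering, can be shown on a toy instruction list. The sketch below assumes a made-up three-opcode machine (`set`, `add`, `nop`); the `execute` and `insert_junk` names are hypothetical. It only checks the property the slide relies on: the variants look different but compute the same result.

```python
import random


def execute(program):
    """Run a toy program and return the final register values."""
    regs = {}
    for op, *args in program:
        if op == "set":
            regs[args[0]] = args[1]
        elif op == "add":
            dst, a, b = args
            regs[dst] = regs[a] + regs[b]
        elif op == "nop":
            pass  # junk instruction: no effect on any register
    return regs


def insert_junk(program, count, rng):
    """Return a copy with 'count' nops inserted at random positions."""
    out = list(program)
    for _ in range(count):
        out.insert(rng.randrange(len(out) + 1), ("nop",))
    return out


base = [("set", "r1", 12), ("set", "r2", 34), ("add", "r3", "r1", "r2")]
junked = insert_junk(base, 3, random.Random(7))
reordered = [base[1], base[0], base[2]]  # the two 'set's are independent

print(execute(base) == execute(junked) == execute(reordered))  # True
print(junked != base and reordered != base)                    # True
```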
Mutation Engine • Subroutine permutation • DIY virtual machine • Concurrency --- threads • Inlining/outlining • “Threaded” code --- not threads Jump directly from one subroutine to another, without returning • Subroutine interleaving
Mutation Engine • Many, many other possibilities • Possible overlap with optimizing compilers? • Seems more like de-optimizing…
Equivalent Instructions • All of these lines set register r1 to 0
clear r1
xor r1,r1
and 0,r1
move 0,r1
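The equivalence can be checked with a few lines of code. The sketch below interprets the four instructions on a toy register file, assuming the operand order is source first and destination last (as `move 0,r1` suggests); the `run_line` helper and the starting value are made up for illustration.

```python
def run_line(line, r1_start):
    """Interpret one toy instruction and return the resulting value of r1."""
    regs = {"r1": r1_start}
    op, _, rest = line.partition(" ")
    args = [a.strip() for a in rest.split(",")]

    def value(arg):
        # An operand is either a register name or an integer literal.
        return regs[arg] if arg in regs else int(arg)

    if op == "clear":
        regs[args[0]] = 0
    elif op == "xor":
        src, dst = args
        regs[dst] = value(src) ^ regs[dst]
    elif op == "and":
        src, dst = args
        regs[dst] = value(src) & regs[dst]
    elif op == "move":
        src, dst = args
        regs[dst] = value(src)
    else:
        raise ValueError(f"unknown opcode: {op}")
    return regs["r1"]


LINES = ["clear r1", "xor r1,r1", "and 0,r1", "move 0,r1"]
print([run_line(line, 0xBEEF) for line in LINES])  # [0, 0, 0, 0]
```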
Concurrency Example • Original code
r1 = 12
r2 = 34
r3 = r1 + r2
• Transformed code, with r2 set by a separate thread T
start thread T
r1 = 12
wait for signal
r3 = r1 + r2
...
T: r2 = 34
send signal
exit thread T
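The transformed version maps directly onto standard threading primitives. The sketch below uses Python's `threading.Thread` and `threading.Event` for the "signal"; the dictionary standing in for registers and the `concurrent_sum` wrapper are assumptions for the example. The result is the same as the sequential code, but the assignment to r2 now happens in a different thread.

```python
import threading


def concurrent_sum():
    """Compute r3 = r1 + r2 with r2 assigned by a separate thread T."""
    regs = {}
    signal = threading.Event()

    def thread_t():
        regs["r2"] = 34
        signal.set()   # send signal, then the thread exits

    t = threading.Thread(target=thread_t)
    t.start()          # start thread T
    regs["r1"] = 12
    signal.wait()      # wait for signal, so r2 is guaranteed to be set
    regs["r3"] = regs["r1"] + regs["r2"]
    t.join()
    return regs["r3"]


print(concurrent_sum())  # 46
```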
Concurrency • Aside: Concurrency may be a very effective anti-reversing technique • Use multiple threads • Intentional deadlock • “Junk” threads • Described in master’s project: • Improved software activation using multithreading
Mutation • Mutation also can be used for good • Makes reverse engineering attacks more difficult • Make software more “diverse”