Malware Reverse Engineering for Beginners: From Zero to Real Sample Analysis

Malware reverse engineering sounds intimidating. In reality, even a beginner with the right tools and methodology can extract meaningful intelligence from a sample: understand what a binary does, extract network indicators, find hardcoded C2 addresses, and determine whether it belongs to a known family or is something novel. This guide teaches you to work safely with malware from day one and build up to real sample analysis.

Setting Up a Safe Analysis Environment

Never analyze malware on your host machine. Always use an isolated virtual machine with network access disabled or redirected to a monitoring environment. If a sample escapes the analysis environment or reaches your real network, your host and every machine it can reach are at risk.

# Recommended VM setup:
# 1. Install VirtualBox or VMware Workstation
# 2. Create a Windows 10/11 VM (most malware targets Windows)
# 3. Install FlareVM (automated malware analysis toolkit); the installer needs internet access
# 4. Disable the real network: switch the VM to Host-Only networking
# 5. Take a clean snapshot BEFORE any analysis and revert to it after each sample

# Install FlareVM on your Windows analysis VM:
# Download installer from: github.com/mandiant/flare-vm
# Run in PowerShell (as Administrator):
Set-Location "$env:USERPROFILE\Desktop"
(New-Object net.webclient).DownloadFile('https://raw.githubusercontent.com/mandiant/flare-vm/main/install.ps1', "$env:USERPROFILE\Desktop\install.ps1")
Unblock-File .\install.ps1
Set-ExecutionPolicy Unrestricted -Scope Process
.\install.ps1
# This installs: IDA Free, Ghidra, x64dbg, pestudio, FLOSS, Wireshark, and 100+ other tools

Phase 1: Static Analysis — Before Running Anything

Static analysis means examining a binary without executing it. This is safe and gives you initial intelligence about what the sample might do.

# Step 1: Get the file hash — check if it is a known sample
md5sum malware.exe
sha256sum malware.exe
# In PowerShell on the Windows VM: Get-FileHash malware.exe -Algorithm SHA256
# Search hash on VirusTotal: virustotal.com
# Search on MalwareBazaar: bazaar.abuse.ch
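If the Unix hashing utilities are not available in your analysis VM, the same step can be sketched with Python's standard library. This is a minimal illustration, not part of any tool named in this guide; the function name file_hashes is hypothetical.

```python
import hashlib

def file_hashes(path, chunk_size=1 << 20):
    """Return (md5_hex, sha256_hex) for the file at `path`, read in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large samples are never loaded whole
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()
```

Calling file_hashes("malware.exe") returns both digests as hex strings that you can paste into the VirusTotal or MalwareBazaar search box.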

# Step 2: Determine file type (never trust the extension)
file malware.exe
exiftool malware.exe  # metadata
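To see why the extension cannot be trusted, it helps to know what a type check actually looks at. The sketch below tests only the two signatures that mark a Windows PE file; the function name looks_like_pe is hypothetical, and real tools such as file check far more formats.

```python
import struct

def looks_like_pe(data: bytes) -> bool:
    """Check for the DOS 'MZ' magic and the PE signature that e_lfanew points to."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: 4-byte little-endian offset of the PE header, stored at 0x3C
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    # An out-of-range offset yields an empty slice, which simply fails the test
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"
```

Read the sample in binary mode and pass the bytes in; a file named invoice.pdf that returns True here is an executable regardless of its name.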

# Step 3: Strings extraction — readable text inside the binary
strings malware.exe
strings -n 8 malware.exe | grep -E "http|\.exe|cmd|powershell|HKEY"
# FLOSS (better strings tool — finds obfuscated strings too):
floss malware.exe
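For intuition about what plain string extraction does (FLOSS goes much further), here is a rough Python sketch that pulls ASCII and UTF-16LE runs out of raw bytes and keeps the ones matching a few indicator patterns. The names and the pattern list are illustrative assumptions, not FLOSS internals.

```python
import re

# Runs of 6 or more printable ASCII characters, similar to `strings -n 6`
ASCII_RUN = re.compile(rb"[\x20-\x7e]{6,}")
# The same idea for UTF-16LE text, which Windows binaries use heavily
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){6,}")
# Illustrative indicator patterns; extend these for your own triage
INDICATOR = re.compile(r"https?://|\.exe\b|cmd|powershell|HKEY_", re.IGNORECASE)

def extract_strings(data: bytes):
    """Yield ASCII strings first, then UTF-16LE strings, found in raw bytes."""
    for match in ASCII_RUN.finditer(data):
        yield match.group().decode("ascii")
    for match in UTF16_RUN.finditer(data):
        yield match.group().decode("utf-16-le")

def interesting_strings(data: bytes):
    """Return only the strings that match an indicator pattern."""
    return [s for s in extract_strings(data) if INDICATOR.search(s)]
```

The UTF-16LE pass matters because a plain ASCII scan misses wide strings entirely.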

# Step 4: PE header analysis (for Windows executables)
# Use pestudio (GUI tool from FlareVM) OR:
python3 -c "
import pefile
pe = pefile.PE('malware.exe')
print('Imports:')
for entry in getattr(pe, 'DIRECTORY_ENTRY_IMPORT', []):  # packed samples may have no import table
    print('  DLL:', entry.dll.decode())
    for imp in entry.imports:
        if imp.name:
            print('    ', imp.name.decode())
"
# Key suspicious imports:
# VirtualAlloc, WriteProcessMemory, CreateRemoteThread → process injection
# WinExec, CreateProcess → executing commands
# RegOpenKey, RegSetValue → registry persistence
# InternetOpen, HttpSendRequest → network communication
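The mapping above can be turned into a small triage helper. This sketch assumes you already have a list of imported function names (for example, from the pefile loop earlier); the dictionary and the function name classify_imports are hypothetical, and the categories are exactly the four listed in this guide.

```python
# API names are matched by prefix so that ANSI/Unicode/Ex variants
# such as CreateProcessW or RegSetValueExA also match.
SUSPICIOUS_IMPORTS = {
    "process injection": ("VirtualAlloc", "WriteProcessMemory", "CreateRemoteThread"),
    "executing commands": ("WinExec", "CreateProcess"),
    "registry persistence": ("RegOpenKey", "RegSetValue"),
    "network communication": ("InternetOpen", "HttpSendRequest"),
}

def classify_imports(import_names):
    """Group imported function names by the behavior they hint at."""
    findings = {}
    for name in import_names:
        for behavior, prefixes in SUSPICIOUS_IMPORTS.items():
            if name.startswith(prefixes):  # str.startswith accepts a tuple
                findings.setdefault(behavior, []).append(name)
    return findings
```

Treat the result as a hint, not a verdict: legitimate software imports these functions too, so the value lies in which combinations appear together.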

Phase 2: Dynamic Analysis — Running It Safely

Dynamic analysis means running the malware in your controlled environment and monitoring what it does. You need monitoring tools running BEFORE you execute the sample.

# Tools to have running BEFORE executing malware:
# - Process Monitor (procmon) — tracks file, registry, network events
# - Wireshark — captures all network traffic
# - Regshot — takes registry snapshots before and after
# - Process Hacker (now developed as System Informer) — live process and memory viewer

# Open source automated analysis: Cuckoo Sandbox
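# Cuckoo runs the malware and generates a full report automatically
# (the original Cuckoo project is no longer actively maintained; CAPE Sandbox is a maintained successor)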
docker run -it -p 8080:8080 remnux/cuckoo

# Or use ANY.RUN (online, free tier):
# Upload sample at any.run — interactive malware sandbox in browser

# After execution, look for:
# - New autorun registry keys (persistence):
#   HKCU\Software\Microsoft\Windows\CurrentVersion\Run
#   HKLM\Software\Microsoft\Windows\CurrentVersion\Run
# - Files dropped in: %TEMP%, %APPDATA%, C:\Windows\System32
# - Network connections: C2 IP addresses, domains contacted
# - Scheduled tasks created
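Spotting dropped files works on the same before-and-after principle that Regshot applies to the registry. The sketch below applies that idea to a directory tree; it is a simplified illustration with hypothetical function names, not a replacement for Process Monitor.

```python
import hashlib
import os

def snapshot(root):
    """Map each file path under `root` to the SHA-256 of its contents."""
    state = {}
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            try:
                with open(path, "rb") as fh:
                    state[path] = hashlib.sha256(fh.read()).hexdigest()
            except OSError:
                continue  # locked or unreadable files are skipped
    return state

def diff_snapshots(before, after):
    """Return (created, deleted, modified) path lists between two snapshots."""
    created = sorted(p for p in after if p not in before)
    deleted = sorted(p for p in before if p not in after)
    modified = sorted(p for p in after if p in before and after[p] != before[p])
    return created, deleted, modified
```

Take one snapshot of a directory such as %TEMP% before detonation and one after, then pass both to diff_snapshots to list what the sample created, removed, or altered.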

Phase 3: Code Analysis with Ghidra

# Open Ghidra (free from NSA, included in FlareVM):
# 1. File → New Project → Import file (your malware sample)
# 2. Auto-analysis runs → decompiled C-like code appears

# Key techniques in Ghidra:
# Find the entry point: start at the "entry" label; "main" or "WinMain" are only named if symbols survive
# Follow function calls from main
# Look at string cross-references:
#   Right click a string → References → Show References to

# Common obfuscation you will see:
# XOR encryption: bytes XORed with a key to hide strings
# Example decompiled code:
# for (i = 0; i < len; i++) buf[i] ^= key;   // typical shape; buf, len, key are placeholder names
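Once you recognize an XOR loop in the decompiler, you can usually decode the hidden strings outside Ghidra. The following is a minimal sketch under the assumption of a short repeating key; the function names and the example URL are made up for illustration.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR `data` with a repeating `key`; applying it twice restores the input."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def brute_force_single_byte(data: bytes, needle: bytes):
    """Try every non-zero one-byte key; return (key, decoded) pairs containing `needle`."""
    hits = []
    for key in range(1, 256):
        decoded = xor_bytes(data, bytes([key]))
        if needle in decoded:
            hits.append((key, decoded))
    return hits

# Usage: encode a made-up C2 URL with key 0x5A, then recover key and plaintext
encoded = xor_bytes(b"http://c2.example.com/gate.php", b"\x5a")
recovered = brute_force_single_byte(encoded, b"http")
```

Searching for a known plaintext fragment such as "http" is a common shortcut when the key is a single byte; longer keys are normally read straight out of the decompiled loop instead.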

Where to Get Safe Malware Samples for Practice

  • MalwareBazaar (bazaar.abuse.ch) — Daily uploads of fresh malware samples with tags and hashes
  • theZoo (github.com/ytisf/theZoo) — Repository of historic malware with documentation
  • VirusShare (virusshare.com) — Large collection, requires registration
  • Hybrid Analysis (hybrid-analysis.com) — Submit or download samples, see existing analysis reports
  • CAPE Sandbox — Open source sandbox with malware family extraction

Learning Resources for Reverse Engineering

  • Practical Malware Analysis (Sikorski and Honig) — the essential textbook for malware analysts
  • OpenSecurityTraining2 (ost2.fyi) — free, rigorous x86/x64 assembly and RE courses
  • Malware Unicorn workshops (malwareunicorn.org) — free hands-on workshops with sample files
  • Dr. Josh Stroschein (YouTube) — excellent free malware analysis walkthroughs of real samples