OSINT for Beginners: How Attackers Research Targets (and How to Protect Yourself)

OSINT — Open Source Intelligence — is the practice of collecting information from publicly available sources. Attackers use it to research targets before launching attacks. Security professionals use it to understand their own exposure. This guide shows you what attackers can find about you and how to minimize it.

What Is OSINT?

Open source doesn’t mean “open source code” — it means publicly available. OSINT sources include:

Social media profiles
Company websites and job postings
DNS and WHOIS records
Public government records
Search engines and data broker sites
LinkedIn profiles
GitHub repositories
Old breached databases

Why Attackers Do OSINT First

Before attacking a company, skilled attackers spend significant time in reconnaissance. A targeted phishing email that mentions your project management tool, your colleague’s name, and your company’s current challenges is far more convincing than a generic phish.

What attackers discover through OSINT:

Employee names, roles, and email formats
Technology stack (from job postings: “experience with AWS, Kubernetes, PostgreSQL required”)
Organizational structure
Exposed credentials in old breach dumps
Subdomains and internet-facing systems
Sensitive files accidentally published to GitHub

OSINT Techniques and Tools

Google Dorking

Advanced Google search operators find information that simple searches miss.

# Find login pages on a specific site:
site:yourcompany.com inurl:login OR inurl:admin OR inurl:portal

# Find PDFs accidentally exposed:
site:yourcompany.com filetype:pdf "confidential"

# Find credentials that mention your domain in public GitHub code:
site:github.com "yourcompany.com" "password" OR "api_key" OR "secret"

# Find subdomains:
site:*.yourcompany.com

# Find exposed directories:
site:yourcompany.com intitle:"index of"
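
The queries above can be generated per domain rather than retyped. The following is a minimal Python sketch, assuming you only want clickable Google search URLs for a domain you are authorized to assess. The `DORK_TEMPLATES` table and the `build_dorks` function are hypothetical names introduced here for illustration, not part of any existing tool.

```python
from urllib.parse import quote_plus

# Hypothetical templates mirroring the queries above; {domain} is filled in per target.
DORK_TEMPLATES = {
    "login pages": "site:{domain} inurl:login OR inurl:admin OR inurl:portal",
    "confidential PDFs": 'site:{domain} filetype:pdf "confidential"',
    "directory listings": 'site:{domain} intitle:"index of"',
    "subdomains": "site:*.{domain}",
}


def build_dorks(domain: str) -> dict[str, str]:
    """Return a label -> Google search URL mapping for one domain."""
    base = "https://www.google.com/search?q="
    return {
        label: base + quote_plus(template.format(domain=domain))
        for label, template in DORK_TEMPLATES.items()
    }


if __name__ == "__main__":
    # Print one URL per query so they can be opened in a browser.
    for label, url in build_dorks("yourcompany.com").items():
        print(f"{label}: {url}")
```

Keeping the templates in one table makes it easy to add or retire queries as your own exposure changes.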

Shodan: The Internet of Everything Search Engine

# Shodan indexes internet-connected devices and their services
# Free account at shodan.io

# Find devices Shodan attributes to an organization by name (use net: with a CIDR block to search by IP range instead):
org:"Your Company Name"

# Find exposed webcams:
product:webcam city:"New York"

# Find services flagged for a known CVE (the vuln: filter is limited to paid and academic Shodan plans):
vuln:CVE-2021-44228      # Log4Shell vulnerable systems

# Find exposed databases:
port:27017 -authentication   # MongoDB with no auth
port:6379 -requirepass       # Redis with no auth

# Command line (pip install shodan, then: shodan init YOUR_API_KEY):
shodan search --fields ip_str,port,transport,product 'org:"Acme Corp"'
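
The same kind of search can be scripted with the `shodan` Python package that provides the command-line tool. This is a sketch under assumptions: `fetch_matches` and `summarize_ports` are hypothetical helper names, and the API key is whatever your own Shodan account provides. Each entry in `matches` is a banner record; the code reads only its `port` and `product` fields, and `product` is not always present.

```python
from collections import Counter


def fetch_matches(api_key: str, query: str) -> list[dict]:
    """Run one Shodan search and return the raw banner records."""
    import shodan  # third-party: pip install shodan

    api = shodan.Shodan(api_key)
    return api.search(query)["matches"]


def summarize_ports(matches: list[dict]) -> Counter:
    """Count exposed (port, product) pairs across Shodan banner records."""
    return Counter(
        # Fall back to "unknown" when Shodan could not identify the product.
        (m["port"], m.get("product") or "unknown")
        for m in matches
    )
```

With a key in hand, `summarize_ports(fetch_matches(key, 'org:"Your Company"'))` gives a quick count of which services appear most often, which is a reasonable starting point for a remediation list.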

Email Discovery

# Find email addresses associated with a domain:
# hunter.io: free tier (25 searches/month at the time of writing; check current limits)
# phonebook.cz: free email discovery
# theHarvester tool:

theHarvester -d targetcompany.com -b all
# Queries many sources (search engines, LinkedIn, Shodan, VirusTotal, etc.); available sources vary by version and some require API keys
# Returns: emails, hosts, IPs, URLs
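
Once a discovery tool has produced output, a defender usually wants to know which addresses at their own domain are exposed. The sketch below assumes you have saved that output as plain text; `EMAIL_RE` is a deliberately simple pattern and `emails_for_domain` is a hypothetical helper, not part of theHarvester.

```python
import re

# Simple pattern for illustration; it does not cover every valid address form.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails_for_domain(text: str, domain: str) -> list[str]:
    """Return sorted, de-duplicated addresses at `domain` or its subdomains."""
    domain = domain.lower()
    found = set()
    for match in EMAIL_RE.findall(text):
        address = match.lower()
        host = address.rsplit("@", 1)[1]
        # Accept the exact domain or any subdomain, but not look-alike suffixes.
        if host == domain or host.endswith("." + domain):
            found.add(address)
    return sorted(found)
```

The resulting list also shows the address format in use (for example `first.last@`), which is exactly what an attacker would infer from the same data.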

DNS Reconnaissance

# Enumerate subdomains:
# Sublist3r:
python sublist3r.py -d example.com

# Amass (more powerful):
amass enum -d example.com

# Manual DNS lookups:
nslookup -type=MX example.com    # Mail servers
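nslookup -type=TXT example.com   # SPF and domain-verification records
nslookup -type=TXT _dmarc.example.com   # DMARC policy (DKIM keys live at <selector>._domainkey.example.com)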
nslookup -type=NS example.com    # Name servers

# Check for DNS zone transfer (misconfiguration):
dig axfr @ns1.example.com example.com
# A hardened server answers "Transfer failed"; if it returns the full zone, anyone can list every record, which is a misconfiguration to fix
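
TXT records are worth a closer look because an SPF record names the third-party mail services a domain authorizes, which hints at the technology stack. The following sketch assumes you have already fetched the TXT value (for example with the `nslookup` command above) and only parses the string. `spf_third_parties` is a hypothetical helper name; it handles just the `include:` mechanism and the `redirect=` modifier.

```python
def spf_third_parties(txt_record: str) -> list[str]:
    """Return domains referenced by include: and redirect= in one SPF record."""
    tokens = txt_record.strip().strip('"').split()
    if not tokens or tokens[0].lower() != "v=spf1":
        return []  # not an SPF record
    domains = []
    for token in tokens[1:]:
        # Drop the optional qualifier (+, -, ~, ?) before the mechanism name.
        lowered = token.lower().lstrip("+-~?")
        if lowered.startswith("include:"):
            domains.append(lowered[len("include:"):])
        elif lowered.startswith("redirect="):
            domains.append(lowered[len("redirect="):])
    return domains
```

If the list contains services you no longer use, removing them shrinks both your attack surface and what your DNS reveals.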

GitHub Secret Scanning

# Developers accidentally commit credentials to public repositories
# trufflehog3 (a Python variant of the TruffleHog scanner) searches git history for secrets:
pip install trufflehog3
trufflehog3 https://github.com/target-org/target-repo

# GitHub's own secret scanning (for repo owners):
# Settings > Code security and analysis > Secret scanning

# Things commonly found in public repos:
# AWS credentials (aws_access_key_id, aws_secret_access_key)
# API keys (Stripe, Twilio, SendGrid)
# Database connection strings
# Private SSH keys
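
To make the idea concrete, here is a minimal sketch of pattern-based secret detection, assuming you run it over text from a repository you own. The three patterns in `SECRET_PATTERNS` are illustrative choices made for this example and `scan_text` is a hypothetical helper; dedicated scanners use far larger rule sets, which this does not replace.

```python
import re

# Illustrative patterns only; real scanners ship many more rules.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private_key": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH |DSA )?PRIVATE KEY-----"
    ),
    "db_connection_string": re.compile(
        r"\b(?:postgres(?:ql)?|mysql|mongodb(?:\+srv)?)://[^\s:/@]+:[^\s@]+@\S+"
    ),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits
```

A match only means a line looks like a credential. Anything found in a public repository should be treated as compromised and rotated, since deleting the commit does not remove it from clones or forks.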

Doing an OSINT Assessment on Yourself

# Step 1: Google yourself
# "your full name"
# "your email address"
# "your phone number"

# Step 2: Check data broker sites and opt out:
# spokeo.com, whitepages.com, intelius.com, beenverified.com
# Search for yourself and use their opt-out forms

# Step 3: Check if your email was breached:
# haveibeenpwned.com

# Step 4: Check what your company exposes:
# Google: site:yourcompany.com
# Shodan: org:"Your Company"
# Check GitHub for your domain: site:github.com "yourcompany.com" password

# Step 5: Audit your social media exposure:
# LinkedIn: what technology does your profile reveal?
# Does your profile help attackers build a spear phishing email?

Protecting Your Personal Information

Opt out of data brokers: use a service like DeleteMe, or manually opt out of Spokeo, Whitepages, and similar sites.
Use privacy settings on social media: control who sees your personal details, location, and relationships.
Use aliases for non-essential signups: SimpleLogin or addy.io (formerly AnonAddy) create forwarding aliases, so your real address stays private.
Be careful what you share on LinkedIn: exact job titles, technologies used, and project names all help attackers.
Use private WHOIS for your domain: many registrars include WHOIS privacy for free or for a few dollars per year.

Wrap Up

OSINT is the first step in almost every targeted attack. Understanding what attackers can find before they find it — and systematically reducing your exposure — is a proactive security strategy called “attack surface reduction.” Run an OSINT assessment on your organization annually, and treat the findings as a remediation priority list.