Hunting Exposed Secrets in GitHub Repos: My Neovim-Powered Workflow


Finding secrets in source code is one of those tasks that sounds simple but has enough edge cases and tooling decisions to consume an entire afternoon if you're not methodical about it. I've been doing this kind of work for a few years now — both proactively scanning our own org's repos and as part of security assessments — and I've settled into a workflow that I actually enjoy using.

The core tools are trufflehog and gitleaks. Neovim is my editor of choice, and I've leaned into it for triaging results, annotating findings, and assembling reports. This post walks through the full workflow end to end, including my Neovim config snippets that make the reporting part less painful.


Why Two Tools?

Fair question. The short answer is that trufflehog and gitleaks have different detection approaches, and using both increases coverage.

Trufflehog uses entropy analysis and regular expression matching, but its killer feature is that it scans the full git history, not just the current state of the repo. Secrets that were committed and then deleted are still in the git object store and fully recoverable. Trufflehog finds them. It also has a growing library of verified detectors — it'll actually try to call the API with the found credential to confirm it's live.

Gitleaks is fast, configurable, and excels at custom pattern matching. Its TOML-based config lets you define your own regex rules for internal secrets formats (internal service tokens, custom API key patterns, etc.) that trufflehog doesn't know about.

Running both takes maybe twice as long but produces a more complete picture. On a large monorepo, triage time dominates anyway, so the extra scan time is worth it.
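When both tools flag the same secret, I dedupe by location before triage so the overlap doesn't get counted twice. A minimal sketch, keying each finding by (file, line) — the field names follow each tool's JSON output as shown later in this post, so adjust for your versions:

```python
import json

def gitleaks_locations(findings):
    """Gitleaks emits a JSON array; key each finding by (file, line)."""
    return {(f.get("File", ""), f.get("StartLine", 0)) for f in findings}

def trufflehog_locations(ndjson_text):
    """Trufflehog emits NDJSON; pull (file, line) out of the Git metadata."""
    locs = set()
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue
        git = json.loads(line).get("SourceMetadata", {}).get("Data", {}).get("Git", {})
        locs.add((git.get("file", ""), int(git.get("line", 0))))
    return locs

def unique_to_each(gitleaks_findings, trufflehog_ndjson):
    """Findings only one tool caught — the ones the extra scan time paid for."""
    gl = gitleaks_locations(gitleaks_findings)
    th = trufflehog_locations(trufflehog_ndjson)
    return gl - th, th - gl
```

The `unique_to_each` helper is my own naming, not part of either tool; the point is just that the symmetric difference between the two result sets is usually non-empty, which is the whole argument for running both.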


Setting Up the Tools

bash
# Trufflehog — Go binary, or via brew
brew install trufflehog

# Gitleaks
brew install gitleaks

# Verify
trufflehog --version
gitleaks version

For GitHub scanning, you'll want a GitHub token with repo scope (or public_repo for public-only work):

bash
export GITHUB_TOKEN=ghp_yourtokenhere

Running a Scan

Trufflehog on a Single Repo

bash
# Scan a local repo's full git history
trufflehog git file://./path/to/repo --json > trufflehog-results.json

# Scan a remote GitHub repo (uses GITHUB_TOKEN for auth)
trufflehog github --repo https://github.com/org/repo --json > trufflehog-results.json

# Scan all repos in a GitHub org
trufflehog github --org myorg --json > trufflehog-org-results.json

# Only verified findings (confirmed live credentials)
trufflehog github --org myorg --only-verified --json > trufflehog-verified.json

The --only-verified flag is gold for triage — if it shows up there, it's a confirmed live credential and it's P1. Start there.
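To get an instant read on what a scan produced, I count verified findings per detector before opening anything in an editor. A small sketch against trufflehog's NDJSON output (the function name is mine):

```python
import json
from collections import Counter

def verified_by_detector(ndjson_text):
    """Count verified findings per detector from trufflehog NDJSON output."""
    counts = Counter()
    for line in ndjson_text.splitlines():
        if not line.strip():
            continue
        finding = json.loads(line)
        if finding.get("Verified"):
            counts[finding.get("DetectorName", "unknown")] += 1
    return counts

# Usage: feed it the raw scan output
# with open("trufflehog-results.json") as fh:
#     print(verified_by_detector(fh.read()).most_common())
```

If the top of that list is AWS or GitHub tokens, I know where the afternoon is going before I've read a single finding.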

Gitleaks on a Repo

bash
# Basic scan
gitleaks detect --source ./path/to/repo --report-format json --report-path gitleaks-results.json

# Scan git history (not just working tree)
gitleaks detect --source ./path/to/repo --log-opts="--all" \
  --report-format json --report-path gitleaks-history.json

# Use a custom config for internal patterns
gitleaks detect --source . --config .gitleaks-custom.toml \
  --report-format json --report-path gitleaks-custom.json

Custom config example for internal token patterns:

toml
# .gitleaks-custom.toml
title = "Custom Gitleaks Config"

[[rules]]
id = "internal-service-token"
description = "Internal service authentication token"
regex = '''IST-[a-zA-Z0-9]{32}'''
keywords = ["IST-"]

[[rules]]
id = "internal-api-key"
description = "Internal API key format"
regex = '''IKEY-[0-9]{4}-[a-zA-Z0-9]{24}'''
keywords = ["IKEY-"]

[allowlist]
paths = [
  '''node_modules''',
  '''vendor''',
  '''\.git'''
]
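Before relying on a custom config, I sanity-check each regex against a sample token so a typo in the pattern doesn't silently disable a rule. A quick sketch — the sample tokens below are fabricated to match the formats above:

```python
import re

# Patterns copied from the custom gitleaks config above
rules = {
    "internal-service-token": r"IST-[a-zA-Z0-9]{32}",
    "internal-api-key": r"IKEY-[0-9]{4}-[a-zA-Z0-9]{24}",
}

# Made-up tokens in the expected formats
samples = {
    "internal-service-token": "IST-" + "Ab3" * 10 + "Zz",  # 32 alnum chars
    "internal-api-key": "IKEY-2024-" + "x9Y" * 8,          # 4 digits, then 24 alnum
}

for rule_id, pattern in rules.items():
    match = re.search(pattern, samples[rule_id])
    print(f"{rule_id}: {'OK' if match else 'NO MATCH'}")
```

Thirty seconds of this has saved me from shipping a rule with `{32}` where the real format was 40 characters.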

Converting Results to CSV for Reporting

Raw JSON is great for processing but terrible for handing to a dev team or putting in a ticket. I convert to CSV as the first step in my reporting workflow.

bash
# Convert trufflehog JSON output to CSV
# trufflehog outputs one JSON object per line (NDJSON format)
# Note: the heredoc occupies stdin (python3 reads the script from it),
# so the JSON file is passed as an argument rather than piped in.
python3 - trufflehog-results.json <<'EOF'
import csv
import json
import sys

writer = csv.DictWriter(sys.stdout, fieldnames=[
    'detector', 'raw', 'verified', 'file', 'line', 'commit', 'author', 'date', 'repository'
])
writer.writeheader()

with open(sys.argv[1]) as fh:
    for line in fh:
        line = line.strip()
        if not line:
            continue
        try:
            finding = json.loads(line)
        except json.JSONDecodeError:
            continue
        git = finding.get('SourceMetadata', {}).get('Data', {}).get('Git', {})
        raw = finding.get('Raw', '')
        writer.writerow({
            'detector': finding.get('DetectorName', ''),
            'raw': raw[:80] + '...' if len(raw) > 80 else raw,
            'verified': finding.get('Verified', False),
            'file': git.get('file', ''),
            'line': git.get('line', ''),
            'commit': git.get('commit', '')[:8],
            'author': git.get('author', ''),
            'date': git.get('timestamp', ''),
            'repository': git.get('repository', ''),
        })
EOF
bash
# Convert gitleaks JSON to CSV
cat gitleaks-results.json | python3 -c "
import json, csv, sys
data = json.load(sys.stdin)
if not data:
    sys.exit(0)
w = csv.DictWriter(sys.stdout, fieldnames=['rule','file','line','commit','author','date','secret_preview'])
w.writeheader()
for f in data:
    w.writerow({
        'rule': f.get('RuleID',''),
        'file': f.get('File',''),
        'line': f.get('StartLine',''),
        'commit': f.get('Commit','')[:8],
        'author': f.get('Author',''),
        'date': f.get('Date',''),
        'secret_preview': f.get('Secret','')[:40] + ('...' if len(f.get('Secret','')) > 40 else '')
    })
"

I save these as shell functions in my ~/.zshrc so they're always available.


My Neovim Workflow for Triage

This is where it gets personal. You could do this in VS Code or with a spreadsheet, but Neovim is where I think fastest, so that's where I triage.

Opening and Navigating JSON Results

I use fzf.vim and telescope.nvim for quick navigation. For secret scanning specifically, I have a few utility mappings in my init.lua:

lua
-- ~/.config/nvim/lua/security/secrets.lua
-- Load this module for secret scanning sessions

local M = {}

-- Open trufflehog JSON results and set up navigation
function M.open_findings(filepath)
  vim.cmd('vsplit ' .. filepath)
  -- Set up folding for JSON
  vim.opt_local.foldmethod = 'syntax'
  vim.opt_local.foldlevel = 1
  -- Map jq-formatted output shortcut
  vim.keymap.set('n', '<leader>jq',
    ':%!jq .<CR>',
    { buffer = true, desc = 'Format JSON with jq' })
end

-- Extract all unique files with findings for quick jump
function M.list_affected_files()
  vim.cmd([[
    new
    setlocal buftype=nofile
    r !cat trufflehog-results.json | jq -r '.SourceMetadata.Data.Git.file // empty' | sort -u
  ]])
end

-- Create a triage buffer with findings summary
function M.triage_buffer()
  vim.cmd('new findings-triage.md')
  vim.api.nvim_buf_set_lines(0, 0, -1, false, {
    '# Findings Triage',
    '',
    '## Verified (P1)',
    '',
    '## Unverified - High Confidence (P2)',
    '',
    '## Low Signal / Likely False Positive',
    '',
    '## Remediation Notes',
    '',
  })
end

return M

In my main init.lua:

lua
-- Security scanning keymaps (only loaded when I need them)
vim.keymap.set('n', '<leader>sf', function()
  require('security.secrets').list_affected_files()
end, { desc = 'List files with secret findings' })

vim.keymap.set('n', '<leader>st', function()
  require('security.secrets').triage_buffer()
end, { desc = 'Open triage buffer' })

Quickfix List from JSON Results

The real power move is loading findings into Neovim's quickfix list so I can navigate directly to affected files:

lua
-- ~/.config/nvim/lua/security/quickfix.lua

local function load_gitleaks_to_qf(filepath)
  local file = io.open(filepath, 'r')
  if not file then
    vim.notify('Could not open ' .. filepath, vim.log.levels.ERROR)
    return
  end

  local content = file:read('*all')
  file:close()

  local ok, findings = pcall(vim.json.decode, content)
  if not ok or not findings then
    vim.notify('Failed to parse JSON', vim.log.levels.ERROR)
    return
  end

  local qf_items = {}
  for _, f in ipairs(findings) do
    table.insert(qf_items, {
      filename = f.File or '',
      lnum = f.StartLine or 1,
      col = f.StartColumn or 1,
      text = string.format('[%s] %s', f.RuleID or 'unknown', (f.Secret or ''):sub(1, 60)),
      type = 'E',
    })
  end

  vim.fn.setqflist(qf_items)
  vim.cmd('copen')
  vim.notify(string.format('Loaded %d findings into quickfix', #qf_items))
end

vim.keymap.set('n', '<leader>sg', function()
  load_gitleaks_to_qf('gitleaks-results.json')
end, { desc = 'Load gitleaks findings to quickfix' })

Now I can press <leader>sg, get all findings in the quickfix list, and navigate through them with ]q and [q. This is dramatically faster than reading JSON.

Annotating and Exporting the Final Report

When I'm ready to write up findings, I use a simple template I fill in as I triage:

markdown
<!-- findings-report-template.md -->
# Secret Scanning Report — {{repo}} — {{date}}

## Executive Summary

Found {{N}} secrets across {{M}} repositories. {{X}} verified live.

## Findings

### FINDING-001

| Field | Value |
|-------|-------|
| Severity | P1 - Verified |
| Type | AWS Access Key |
| File | src/config/aws.js |
| Commit | abc1234 |
| Author | dev@company.com |
| Date Committed | 2024-01-15 |

**Secret Preview:** `AKIA...` (redacted)

**Remediation:**
- [ ] Rotate the key immediately
- [ ] Audit usage in AWS CloudTrail
- [ ] Remove from git history with `git filter-repo`
- [ ] Add pre-commit hook to prevent recurrence

I keep this template in ~/.config/nvim/templates/secret-report.md and have a snippet for it in my LuaSnip config:

lua
-- In snippets/markdown.lua
local ls = require('luasnip')
local s = ls.snippet
local t = ls.text_node
local i = ls.insert_node

ls.add_snippets('markdown', {
  s('finding', {
    t({ '### FINDING-' }), i(1, '001'),
    t({ '', '', '| Field | Value |', '|-------|-------|' }),
    t({ '', '| Severity | ' }), i(2, 'P1 - Verified'), t(' |'),
    t({ '', '| Type | ' }), i(3, 'API Key'), t(' |'),
    t({ '', '| File | ' }), i(4, 'path/to/file'), t(' |'),
    t({ '', '| Commit | ' }), i(5, 'abc1234'), t(' |'),
    t({ '', '', '**Remediation:**', '- [ ] Rotate immediately', '- [ ] Audit access logs', '- [ ] Remove from history', '' }),
  }),
})

The Full Pipeline as a Script

Putting it all together into a single reproducible script:

bash
#!/usr/bin/env bash
# secret-scan.sh
# Usage: ./secret-scan.sh <github-org> <output-dir>

ORG=$1
OUTPUT_DIR=${2:-./scan-results}
TIMESTAMP=$(date +%Y%m%d-%H%M%S)

mkdir -p "$OUTPUT_DIR"

echo "[*] Starting trufflehog scan of org: $ORG"
trufflehog github \
  --org "$ORG" \
  --json \
  --only-verified \
  2>/dev/null > "$OUTPUT_DIR/trufflehog-verified-$TIMESTAMP.json"

echo "[*] Starting gitleaks scan"
# For org-wide, you'd clone repos first; this example is single-repo
gitleaks detect \
  --source . \
  --log-opts="--all" \
  --report-format json \
  --report-path "$OUTPUT_DIR/gitleaks-$TIMESTAMP.json" \
  2>/dev/null

echo "[*] Converting to CSV"
# trufflehog CSV (@csv rejects raw booleans, so Verified is stringified)
cat "$OUTPUT_DIR/trufflehog-verified-$TIMESTAMP.json" | \
  jq -r '[.DetectorName, (.Verified | tostring), (.SourceMetadata.Data.Git.file // ""), (.SourceMetadata.Data.Git.commit // "")[:8], (.SourceMetadata.Data.Git.author // ""), (.Raw // "")[:60]] | @csv' \
  > "$OUTPUT_DIR/trufflehog-verified-$TIMESTAMP.csv"

echo "[+] Scan complete. Results in $OUTPUT_DIR"
echo "    Verified findings: $(wc -l < "$OUTPUT_DIR/trufflehog-verified-$TIMESTAMP.json")"
ls -lh "$OUTPUT_DIR"

A Word on False Positives

Both tools generate false positives: high-entropy strings that look like secrets but are actually UUIDs, test fixtures, or encoded data. Part of triage is filtering these out. I keep a .gitleaks-allowlist.toml in my scanning toolkit with patterns for known false-positive formats, and I use trufflehog's --only-verified flag aggressively for initial triage.

Don't let the false positive rate discourage you. Even a noisy scan that's 50% false positives is finding real secrets the other 50% of the time.
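A lot of the noise follows predictable shapes, so a first-pass filter is cheap. A sketch of the kind of heuristic I apply before manual triage — the patterns are illustrative, not exhaustive, and since 32-hex strings can be real API keys, use this to deprioritize findings, never to discard them:

```python
import re

# Shapes that frequently trip entropy-based detectors but usually aren't secrets
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I)
HEX_HASH_RE = re.compile(
    r"^[0-9a-f]{32}$|^[0-9a-f]{40}$|^[0-9a-f]{64}$", re.I)  # md5 / sha1 / sha256

def probable_false_positive(secret: str) -> bool:
    """Heuristic: flag UUIDs, bare hex digests, and obvious fixture placeholders."""
    s = secret.strip()
    if UUID_RE.match(s) or HEX_HASH_RE.match(s):
        return True
    # Obvious placeholders that show up in test fixtures and docs
    if any(word in s.lower() for word in ("example", "dummy", "changeme", "xxxx")):
        return True
    return False
```

Anything this tags goes to the bottom of the triage buffer rather than into the trash; the whole point of the "Low Signal / Likely False Positive" section in my template is that I still look at it eventually.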


The Takeaway

The tools exist, they're free, and they're good. The workflow I've described — trufflehog for history + verified detection, gitleaks for custom patterns, CSV conversion for reporting, Neovim for triage — is reproducible and works for everything from a single repo audit to scanning an entire GitHub org.

Clone this workflow, adapt the Neovim config to your setup, and run it against your own repos today. You might be surprised what you find. And if you are surprised — that's the point. Better you find it than someone else does.