homelab-agents/VPS-SSH-KEY-SETUP.md
root 7a748fb8ac Add troubleshooting for wrong SSH_AUTH_SOCK socket issue
Addresses the issue where multiple ssh-agent processes run and the shell
uses /tmp/ssh-* socket instead of systemd's socket.

Improvements:
- Enhanced diagnostic script detects wrong socket usage automatically
- New troubleshooting section for "Multiple ssh-agent processes running"
- Step-by-step fix to clean up ~/.bashrc and use correct socket
- Verification steps to confirm fix

Fixes the symptom: 12 agents running, SSH_AUTH_SOCK pointing to /tmp
instead of ${XDG_RUNTIME_DIR}/ssh-agent.socket

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2025-12-14 07:43:46 +00:00

SSH Key Setup for New VPS

Quick guide to add your SSH private key to a new VPS and configure it for Gitea with guaranteed persistence.

Step 1: Create .ssh Directory

mkdir -p ~/.ssh
chmod 700 ~/.ssh

Step 2: Add Private Key

Get your private key from 1Password and create the file:

cat > ~/.ssh/id_ed25519 << 'KEY'
[PASTE YOUR ENTIRE PRIVATE KEY HERE - from -----BEGIN to -----END]
KEY

Step 3: Set Correct Permissions

chmod 600 ~/.ssh/id_ed25519
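
OpenSSH refuses to use a private key whose permissions are too open, so it can be worth confirming both modes before moving on. A minimal sketch, assuming GNU `stat` (present on most Linux VPS images); `check_mode` is a hypothetical helper name, not part of OpenSSH:

```shell
#!/bin/bash
# check_mode PATH EXPECTED: compare a path's octal permission bits with EXPECTED.
# Hypothetical helper; relies on GNU stat's -c '%a' format.
check_mode() {
    local path="$1" expected="$2" actual
    actual="$(stat -c '%a' "$path" 2>/dev/null)" || {
        echo "missing: $path"
        return 1
    }
    if [ "$actual" = "$expected" ]; then
        echo "ok: $path ($actual)"
    else
        echo "wrong: $path is $actual, expected $expected"
        return 1
    fi
}
```

Typical use would be `check_mode ~/.ssh 700 && check_mode ~/.ssh/id_ed25519 600`.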

Step 4: Start SSH Agent

eval "$(ssh-agent -s)"

You should see: Agent pid XXXXX

Step 5: Add Key to Agent

ssh-add ~/.ssh/id_ed25519

You should see: Identity added

Step 6: Test Connection

ssh -T git@100.120.125.113

Gitea should respond with a message confirming that authentication succeeded.

The most reliable way to keep the SSH agent running is a systemd user service.

Quick Setup (Copy-Paste Method)

# Create systemd service directory
mkdir -p ~/.config/systemd/user

# Create ssh-agent service
cat > ~/.config/systemd/user/ssh-agent.service << 'SERVICEEOF'
[Unit]
Description=SSH key agent

[Service]
Type=simple
Environment=SSH_AUTH_SOCK=%t/ssh-agent.socket
ExecStart=/usr/bin/ssh-agent -D -a $SSH_AUTH_SOCK

[Install]
WantedBy=default.target
SERVICEEOF

# Enable and start the service
systemctl --user enable ssh-agent.service
systemctl --user start ssh-agent.service

# Add to ~/.bashrc
cat >> ~/.bashrc << 'BASHEOF'

# SSH Agent - Use systemd user service
export SSH_AUTH_SOCK="${XDG_RUNTIME_DIR}/ssh-agent.socket"

# Auto-add key on login if the agent holds no identities
# (ssh-add -l exits non-zero when the agent is empty or unreachable)
if ! ssh-add -l > /dev/null 2>&1; then
    ssh-add ~/.ssh/id_ed25519 2>/dev/null
fi
BASHEOF

# Apply changes
source ~/.bashrc

Why This Works

  • systemd keeps the agent running for your whole login, and past logout if lingering is enabled (loginctl enable-linger)
  • Survives reboots - the service auto-starts on login and ~/.bashrc re-adds the key
  • Works with multiple terminals - all use same socket
  • No process hunting - systemd manages the agent lifecycle
  • Clean and simple - one socket location, no guessing
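
Whether the agent outlives a full logout depends on systemd lingering: without it, the user manager (and every user service with it) stops when your last session closes and starts again at the next login. A small sketch for checking the setting; `linger_enabled` is a hypothetical helper, and it assumes `loginctl` is available on the VPS:

```shell
#!/bin/bash
# linger_enabled [USER]: succeed if systemd lingering is on for USER.
# Hypothetical helper; loginctl prints "Linger=yes" or "Linger=no".
linger_enabled() {
    local user="${1:-$USER}"
    [ "$(loginctl show-user "$user" --property=Linger 2>/dev/null)" = "Linger=yes" ]
}
```

For example, `linger_enabled || loginctl enable-linger "$USER"` turns it on only when needed; depending on the host's policy, that second command may require elevated privileges.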

Verify It's Working

# Check service status
systemctl --user status ssh-agent

# Check environment variable
echo $SSH_AUTH_SOCK

# Check loaded keys
ssh-add -l

# Test git connection
ssh -T git@100.120.125.113

All four checks should succeed without re-adding the key.
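
If one of these checks fails, it helps to know whether the variable is wrong or the socket itself is absent. A sketch that separates the cases; `socket_state` is a hypothetical helper name, and the expected path is the one configured in the service above:

```shell
#!/bin/bash
# socket_state CURRENT EXPECTED: classify the agent socket setup.
# Prints one of: unset, mismatch, missing, ok.
socket_state() {
    local current="$1" expected="$2"
    if [ -z "$current" ]; then
        echo "unset"        # SSH_AUTH_SOCK is empty
    elif [ "$current" != "$expected" ]; then
        echo "mismatch"     # the shell points at some other agent
    elif [ ! -S "$current" ]; then
        echo "missing"      # the path is right but no socket exists there
    else
        echo "ok"
    fi
}
```

It could be called as `socket_state "$SSH_AUTH_SOCK" "${XDG_RUNTIME_DIR}/ssh-agent.socket"`.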

Test Persistence

# Logout and back in
exit
# SSH back in

# Key should still be loaded
ssh-add -l

Alternative Method: ~/.bashrc Only (If systemd unavailable)

If your VPS doesn't support systemd user services, use this fallback:

cat >> ~/.bashrc << 'BASHEOF'

# SSH Agent Persistence (bashrc method)
if [ -z "$SSH_AUTH_SOCK" ]; then
    if pgrep -u "$USER" ssh-agent > /dev/null; then
        export SSH_AUTH_SOCK=$(find /tmp -path "*ssh*" -name "agent.*" -user "$USER" 2>/dev/null | head -1)
    else
        eval "$(ssh-agent -s)" > /dev/null
        echo "$SSH_AUTH_SOCK" > ~/.ssh/agent.sock
    fi
fi

if [ -f ~/.ssh/agent.sock ] && [ -z "$SSH_AUTH_SOCK" ]; then
    export SSH_AUTH_SOCK=$(cat ~/.ssh/agent.sock)
fi

# Add the key if the agent holds no identities
if ! ssh-add -l > /dev/null 2>&1; then
    ssh-add ~/.ssh/id_ed25519 2>/dev/null
fi
BASHEOF

source ~/.bashrc

Note: This method is less reliable - agent may die on full logout.
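
One reason this fallback is fragile is that a socket path saved in ~/.ssh/agent.sock can outlive its agent. A hedged sketch of a liveness check that could run before reusing a saved path; `agent_reachable` is a hypothetical helper, and it relies on `ssh-add -l` exiting with status 2 when no agent can be contacted:

```shell
#!/bin/bash
# agent_reachable SOCKET: succeed if an agent answers on SOCKET.
# ssh-add -l exits 0 (keys listed), 1 (agent has no keys) or 2 (no agent).
agent_reachable() {
    local sock="$1" status
    [ -n "$sock" ] || return 1
    SSH_AUTH_SOCK="$sock" ssh-add -l > /dev/null 2>&1
    status=$?
    # Treat only 0 and 1 as "an agent replied"; anything else is a failure.
    [ "$status" -eq 0 ] || [ "$status" -eq 1 ]
}
```

A caller might run `agent_reachable "$(cat ~/.ssh/agent.sock 2>/dev/null)"` and start a fresh agent only when it fails.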

Troubleshooting

Diagnostic Script

Run this to diagnose issues:

cat > ~/ssh-diag.sh << 'DIAGEOF'
#!/bin/bash
echo "=== SSH Agent Diagnostic ==="
echo ""

AGENT_COUNT=$(pgrep -u "$USER" ssh-agent | wc -l)
EXPECTED_SOCK="${XDG_RUNTIME_DIR}/ssh-agent.socket"

echo "1. SSH_AUTH_SOCK: $SSH_AUTH_SOCK"
echo "2. Expected socket: $EXPECTED_SOCK"
echo "3. Running agents: $AGENT_COUNT"
echo "4. Loaded keys:"
ssh-add -l 2>&1
echo ""
echo "5. Systemd service:"
systemctl --user status ssh-agent 2>&1 | head -5
echo ""
echo "6. Shell RC has SSH code:"
grep -q "SSH Agent" ~/.bashrc && echo "   ✓ Found" || echo "   ✗ Not found"
echo ""

# Detect issues
if [[ "$AGENT_COUNT" -gt 1 ]]; then
    echo "⚠ WARNING: $AGENT_COUNT agents running (should be 1)"
    echo "   Fix: See 'Multiple ssh-agent processes' section"
fi

if [[ "$SSH_AUTH_SOCK" != "$EXPECTED_SOCK" ]]; then
    echo "⚠ WARNING: Using wrong socket!"
    echo "   Current: $SSH_AUTH_SOCK"
    echo "   Should be: $EXPECTED_SOCK"
    echo "   Fix: See 'Multiple ssh-agent processes' section"
fi

if systemctl --user is-active ssh-agent >/dev/null 2>&1; then
    echo "✓ Systemd service is running"
else
    echo "✗ Systemd service NOT running"
    echo "   Fix: systemctl --user start ssh-agent"
fi
DIAGEOF

chmod +x ~/ssh-diag.sh
bash ~/ssh-diag.sh

Common Issues

"Could not open a connection to your authentication agent"

# Check if service is running
systemctl --user status ssh-agent

# If stopped, start it
systemctl --user start ssh-agent

# Then reload shell
source ~/.bashrc

"Permission denied (publickey)"

# Check key permissions
ls -la ~/.ssh/id_ed25519

# Should be: -rw------- (600)
chmod 600 ~/.ssh/id_ed25519

# Try adding key manually
ssh-add ~/.ssh/id_ed25519

Agent running but key not loaded after reboot

# Check if auto-add code is in ~/.bashrc
grep -n "ssh-add" ~/.bashrc

# If missing, add it:
echo 'if ! ssh-add -l > /dev/null 2>&1; then ssh-add ~/.ssh/id_ed25519 2>/dev/null; fi' >> ~/.bashrc

Systemd service fails to start

# Check journal logs
journalctl --user -u ssh-agent

# Restart service
systemctl --user daemon-reload
systemctl --user restart ssh-agent

Multiple ssh-agent processes running (shell using wrong socket)

If the diagnostic shows many agents (e.g., 12) and SSH_AUTH_SOCK points to /tmp/ssh-* instead of ${XDG_RUNTIME_DIR}/ssh-agent.socket:

# 1. Kill all agents and restart systemd service cleanly
pkill -u "$USER" ssh-agent
systemctl --user restart ssh-agent

# 2. Check your runtime directory
echo "Should use: ${XDG_RUNTIME_DIR}/ssh-agent.socket"
echo "Currently using: $SSH_AUTH_SOCK"

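# 3. Clean up ~/.bashrc - remove OLD/duplicate SSH agent code
#    (deletes from each "# SSH Agent" comment through the next unindented "fi";
#    compare the result against ~/.bashrc.backup afterwards)
cp ~/.bashrc ~/.bashrc.backup
sed -i '/# SSH Agent/,/^fi$/d' ~/.bashrc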

# 4. Add clean version
cat >> ~/.bashrc << 'BASHEOF'

# SSH Agent - Use systemd user service
export SSH_AUTH_SOCK="${XDG_RUNTIME_DIR}/ssh-agent.socket"

# Auto-add key on login if the agent holds no identities
# (ssh-add -l exits non-zero when the agent is empty or unreachable)
if ! ssh-add -l > /dev/null 2>&1; then
    ssh-add ~/.ssh/id_ed25519 2>/dev/null
fi
BASHEOF

# 5. Apply immediately
source ~/.bashrc

# 6. Verify fix
echo "Agents running: $(pgrep -u "$USER" ssh-agent | wc -l)"  # Should be 1
echo "Using socket: $SSH_AUTH_SOCK"  # Should contain XDG_RUNTIME_DIR
ssh-add -l  # Should show your key

The issue happens when old SSH agent code in ~/.bashrc conflicts with the systemd method.
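
Since step 3 edits ~/.bashrc in place, previewing the affected lines first is a cheap safeguard. A sketch that prints the range instead of deleting it; `preview_ssh_block` is a hypothetical helper, and it assumes each block starts at a `# SSH Agent` comment and ends at an unindented `fi`:

```shell
#!/bin/bash
# preview_ssh_block [FILE]: print, with line numbers, every line from a
# "# SSH Agent" comment through the next line that is exactly "fi".
# Nothing is modified; FILE defaults to ~/.bashrc.
preview_ssh_block() {
    local file="${1:-$HOME/.bashrc}"
    awk '/# SSH Agent/,/^fi$/ { print NR ": " $0 }' "$file"
}
```

If the printed range covers anything other than SSH agent code, editing the file by hand is the safer route.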

"identity_sign: private key contents do not match public"

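This error means the private key file does not match the public key OpenSSH has paired with it, typically a stale ~/.ssh/id_ed25519.pub sitting next to the private key. The key registered on Gitea is often out of date in the same way, so regenerate the public key and update both.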

# Generate correct public key from your private key
ssh-keygen -y -f ~/.ssh/id_ed25519 > /tmp/correct-public-key.pub

# Show it
cat /tmp/correct-public-key.pub

Copy the output (starts with ssh-ed25519 AAAA...), then:

  1. Go to http://100.120.125.113:3000/user/settings/keys
  2. Delete the old/wrong key
  3. Add the correct public key you just generated
  4. Test: ssh -T git@100.120.125.113

Full diagnostic for key mismatch:

#!/bin/bash
echo "=== SSH Key Mismatch Diagnostic ==="

# Generate what public key SHOULD be
ssh-keygen -y -f ~/.ssh/id_ed25519 > /tmp/derived-public-key.pub

echo "=== CORRECT Public Key (copy this to Gitea) ==="
cat /tmp/derived-public-key.pub
echo ""

echo "=== Key Fingerprint ==="
ssh-keygen -lf ~/.ssh/id_ed25519

# Compare with stored public key if it exists (key type and data only;
# the trailing comment field may legitimately differ)
if [ -f ~/.ssh/id_ed25519.pub ]; then
    if [ "$(cut -d' ' -f1,2 ~/.ssh/id_ed25519.pub)" = "$(cut -d' ' -f1,2 /tmp/derived-public-key.pub)" ]; then
        echo "✓ Stored .pub file matches private key"
    else
        echo "✗ Stored .pub file WRONG - delete it and use derived key above"
    fi
fi

Save the diagnostic as ~/fix-key-mismatch.sh, run it, and upload the shown public key to Gitea.

After SSH Works

Now you can clone from Gitea without passwords:

git clone git@100.120.125.113:pdm/homelab-agents.git ~/.homelab-agents
git clone git@100.120.125.113:pdm/vps-system-apps.git ~/projects/system-apps

Or use the bootstrap script:

bash <(curl -s http://100.120.125.113:3000/pdm/homelab-agents/raw/branch/main/scripts/bootstrap-agents.sh)

Quick Reference

# Check agent status
systemctl --user status ssh-agent

# Check loaded keys
ssh-add -l

# Manually add key
ssh-add ~/.ssh/id_ed25519

# Test Gitea connection
ssh -T git@100.120.125.113

# Restart agent service
systemctl --user restart ssh-agent

The systemd method should give you an SSH agent that is available in every session and comes back automatically after logouts and reboots.