When Something Doesn’t Look Right in Logs: A Practical Guide to Log Analysis

The Question Everyone Asks at 3 AM

“Why does this IP talk to that IP?”

It’s 3 AM. You’re on-call. A log entry pops up that doesn’t match anything in your baseline. The server that usually talks to three internal services is suddenly making outbound connections to a foreign IP. Your gut says something is wrong. Your question is simple: is it, or is it just noise?

This is the heart of log analysis. Not the tools. Not the dashboards. The question itself.

Let me walk you through how to answer it.

Step 1: Establish What “Normal” Looks Like

You can’t detect anomalies without a baseline. This isn’t optional.

What to Track

For every system, know:

Typical outbound connections — which IPs, which ports, which protocols
Typical connection frequency — how many connections per hour/day
Typical data volume — how much data flows in/out
Typical timing — when connections happen (business hours vs. off-hours)
Typical users/services — who is connecting to what

Quick Baseline Script

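#!/bin/bash
# baseline_connections.sh: sample active TCP/UDP connections every 5 minutes until stopped

# Byte counts are omitted: plain ss output does not include them
echo "Timestamp,Source IP,Source Port,Dest IP,Dest Port,Protocol" > baseline.csv

# For Linux (requires ss from iproute2). The local end of each socket is recorded
# as the source, so inbound client connections show up here too.
while true; do
    ts=$(date '+%Y-%m-%d %H:%M:%S')
    # ss columns: Netid State Recv-Q Send-Q Local:Port Peer:Port; NR > 1 skips the header
    ss -tun | awk -v ts="$ts" 'NR > 1 {
        src = $5; sport = $5; dst = $6; dport = $6
        sub(/:[^:]*$/, "", src); sub(/.*:/, "", sport)   # split on the last colon
        sub(/:[^:]*$/, "", dst); sub(/.*:/, "", dport)
        print ts "," src "," sport "," dst "," dport "," $1
    }' >> baseline.csv
    sleep 300  # Every 5 minutes
done &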

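Run this for a week and you have a baseline. Sampling every five minutes misses short-lived connections, so treat anything outside the baseline as worth investigating, not as proof of compromise.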
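A raw CSV is hard to compare against by eye, so it can help to roll it up into a per-host profile. The sketch below assumes rows with the Timestamp, Source IP, Dest IP and Dest Port columns from the header above. The function name build_profile, the profile layout and the sample rows are illustrative, not part of any tool.

```python
import csv
import io
from collections import Counter, defaultdict
from datetime import datetime


def build_profile(rows):
    """Summarise baseline rows into {src_ip: {"destinations": set, "hours": Counter}}."""
    profile = defaultdict(lambda: {"destinations": set(), "hours": Counter()})
    for row in rows:
        host = profile[row["Source IP"]]
        host["destinations"].add((row["Dest IP"], row["Dest Port"]))
        # Hour of day (0-23) in which the connection was sampled
        hour = datetime.strptime(row["Timestamp"], "%Y-%m-%d %H:%M:%S").hour
        host["hours"][hour] += 1
    return profile


# Tiny in-memory sample standing in for baseline.csv (made-up data)
sample = io.StringIO(
    "Timestamp,Source IP,Source Port,Dest IP,Dest Port,Protocol\n"
    "2026-06-01 10:00:00,10.0.1.50,40000,10.0.1.100,5432,tcp\n"
    "2026-06-01 10:05:00,10.0.1.50,40001,10.0.1.100,5432,tcp\n"
    "2026-06-01 14:05:00,10.0.1.50,40002,10.0.1.101,6379,tcp\n"
)
profile = build_profile(csv.DictReader(sample))
print(profile["10.0.1.50"]["destinations"])
print(sorted(profile["10.0.1.50"]["hours"].items()))
```

The destination set answers "who does this host talk to?" and the hour counter answers "when?". Data volume would need a source that records bytes, such as flow logs.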

Step 2: The “Why Does This IP Talk to That IP?” Investigation

Let’s say you see this in your logs:

2026-06-07 03:14:22 web-server-01 10.0.1.50:44832 -> 185.220.101.42:443 HTTPS
2026-06-07 03:14:23 web-server-01 10.0.1.50:44833 -> 185.220.101.42:443 HTTPS
2026-06-07 03:14:24 web-server-01 10.0.1.50:44834 -> 185.220.101.42:443 HTTPS

First Questions to Ask

Is this IP known? Run it through threat intelligence:

# Use VirusTotal (free tier)
curl -s "https://www.virustotal.com/api/v3/ip_addresses/185.220.101.42" \
  -H "x-apikey: YOUR_KEY"
   
# Or use Shodan (free tier)
curl -s "https://api.shodan.io/shodan/host/185.220.101.42?key=YOUR_KEY"

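Is the timing normal? Check your baseline. If this web server normally makes no outbound HTTPS connections at 3 AM, and no scheduled job (updates, backups, certificate renewal) explains it, the timing is a red flag.
Is the frequency normal? Compare against the baseline rate. Three new connections in three seconds to a destination this server has never contacted stands out.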

What service is making the connection?

# Find the process that owns connections to the suspicious IP (run as root to see all processes)
lsof -nP -i @185.220.101.42:443

# Or use ss with process info
ss -tunp dst 185.220.101.42
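The timing and frequency questions above can be answered mechanically once you know the host's usual hours. This is a rough sketch that parses the three log lines from this step. The log format regex, the USUAL_HOURS set and the BURST_THRESHOLD value are assumptions for illustration; take the real values from your own baseline.

```python
import re
from collections import Counter
from datetime import datetime

# Matches lines like: 2026-06-07 03:14:22 web-server-01 10.0.1.50:44832 -> 185.220.101.42:443 HTTPS
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<host>\S+) "
    r"(?P<src>[\d.]+):(?P<sport>\d+) -> (?P<dst>[\d.]+):(?P<dport>\d+) (?P<proto>\S+)$"
)

USUAL_HOURS = set(range(8, 19))  # assumed: this host normally connects out 08:00-18:59
BURST_THRESHOLD = 3              # assumed: this many hits on one destination counts as a burst


def triage(lines):
    """Return (off_hours_destinations, burst_destinations) for the given log lines."""
    off_hours, per_dest = set(), Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that do not match the assumed format
        hour = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S").hour
        dest = (m["dst"], int(m["dport"]))
        per_dest[dest] += 1
        if hour not in USUAL_HOURS:
            off_hours.add(dest)
    bursts = {d for d, n in per_dest.items() if n >= BURST_THRESHOLD}
    return off_hours, bursts


logs = [
    "2026-06-07 03:14:22 web-server-01 10.0.1.50:44832 -> 185.220.101.42:443 HTTPS",
    "2026-06-07 03:14:23 web-server-01 10.0.1.50:44833 -> 185.220.101.42:443 HTTPS",
    "2026-06-07 03:14:24 web-server-01 10.0.1.50:44834 -> 185.220.101.42:443 HTTPS",
]
off_hours, bursts = triage(logs)
print("off-hours:", off_hours)
print("bursts:", bursts)
```

Here the same destination lands in both sets, which is exactly the combination that should get you out of bed.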

Step 3: Common Anomaly Patterns

Here are the patterns I’ve seen in real incidents. Learn them.

Pattern 1: Data Exfiltration

Normal:  10.0.1.50 -> 93.184.216.34:443 (legitimate CDN)
Anomaly: 10.0.1.50 -> 185.220.101.42:443 (known Tor exit node)

What to look for:

Connections to known Tor exit nodes
Unusual data volumes (large uploads at odd hours)
Connections to IPs in unexpected geolocations
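The first two checks are easy to express in code. This is a minimal sketch, assuming you keep your own set of Tor exit IPs and have flow records with byte counts. The Flow class, TOR_EXIT_IPS, UPLOAD_LIMIT_BYTES and OFF_HOURS are hypothetical names and values. Geolocation is left out because it needs a GeoIP database.

```python
from dataclasses import dataclass

TOR_EXIT_IPS = {"185.220.101.42"}       # assumed: loaded from a list you maintain
UPLOAD_LIMIT_BYTES = 50 * 1024 * 1024   # assumed threshold (50 MiB); tune from your baseline
OFF_HOURS = set(range(0, 6))            # assumed: 00:00-05:59 counts as off-hours


@dataclass
class Flow:
    hour: int        # hour of day, 0-23
    src_ip: str
    dest_ip: str
    bytes_out: int   # bytes sent from src to dest


def exfil_reasons(flow):
    """Return the list of exfiltration indicators this flow matches."""
    reasons = []
    if flow.dest_ip in TOR_EXIT_IPS:
        reasons.append("tor-exit")
    if flow.bytes_out > UPLOAD_LIMIT_BYTES and flow.hour in OFF_HOURS:
        reasons.append("large-off-hours-upload")
    return reasons


normal = Flow(hour=14, src_ip="10.0.1.50", dest_ip="93.184.216.34", bytes_out=20_000)
odd = Flow(hour=3, src_ip="10.0.1.50", dest_ip="185.220.101.42", bytes_out=200 * 1024 * 1024)
print(exfil_reasons(normal))
print(exfil_reasons(odd))
```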

Pattern 2: C2 Communication

Normal:  10.0.1.50 -> 13.107.42.14:443 (Microsoft update)
Anomaly: 10.0.1.50 -> 45.33.32.156:8443 (unusual port)

What to look for:

Connections to unusual ports (not 80, 443, 53)
Beaconing behavior (regular intervals to the same IP)
DNS queries to unusual domains
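Beaconing is the easiest of these to test for: implants tend to call home on a timer, so the gaps between connections are almost constant. One simple approach, sketched here, is to compute the coefficient of variation of those gaps. The function name and the min_events and max_jitter defaults are assumptions, and real C2 often adds random jitter to defeat exactly this check.

```python
from statistics import mean, pstdev


def is_beaconing(timestamps, min_events=5, max_jitter=0.1):
    """True when gaps between events are nearly constant.

    timestamps: event times in seconds, for one source and one destination.
    max_jitter: maximum allowed (standard deviation / mean) of the gaps.
    """
    if len(timestamps) < min_events:
        return False  # too few events to judge
    ordered = sorted(timestamps)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    avg = mean(gaps)
    if avg == 0:
        return False  # all events at the same instant: a burst, not a beacon
    return pstdev(gaps) / avg <= max_jitter


regular = [0, 60, 121, 180, 241, 300]   # roughly every 60 seconds
irregular = [0, 5, 90, 97, 400, 410]    # bursty, human-like
print(is_beaconing(regular))
print(is_beaconing(irregular))
```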

Pattern 3: Lateral Movement

Normal:  10.0.1.50 -> 10.0.1.100:22 (SSH to internal server)
Anomaly: 10.0.1.50 -> 10.0.2.100:22 (SSH to unexpected subnet)

What to look for:

SSH/RDP to unexpected internal hosts
Connections between subnets that don’t normally communicate
New services appearing on internal hosts
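The subnet check can be sketched with Python's ipaddress module. The ALLOWED set below is a hypothetical policy you would derive from your baseline, and treating only ports 22 and 3389 as admin ports is a simplifying assumption.

```python
import ipaddress

# Hypothetical policy: (source subnet, destination subnet, port) combinations seen in the baseline
ALLOWED = {
    (ipaddress.ip_network("10.0.1.0/24"), ipaddress.ip_network("10.0.1.0/24"), 22),
}
ADMIN_PORTS = {22, 3389}  # SSH and RDP


def is_unexpected_admin_access(src, dst, port):
    """True for SSH/RDP to an internal host when no baseline rule covers the subnet pair."""
    if port not in ADMIN_PORTS:
        return False
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    if not d.is_private:
        return False  # destination is not internal, so this check does not apply
    return not any(s in sn and d in dn and port == p for sn, dn, p in ALLOWED)


print(is_unexpected_admin_access("10.0.1.50", "10.0.1.100", 22))  # same subnet, in baseline
print(is_unexpected_admin_access("10.0.1.50", "10.0.2.100", 22))  # different subnet
```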

Pattern 4: DNS Tunneling

Normal:  DNS query sent to dns.google (8.8.8.8)
Anomaly: DNS query for a subdomain of suspicious-domain.xyz sent to an unknown resolver

What to look for:

DNS queries to unexpected resolvers
Unusually long DNS queries (data in the domain name)
High volume of DNS queries from a single host
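Long, random-looking labels and noisy hosts are both simple to score. The sketch below uses name length, Shannon entropy of the longest label, and a per-host query count. All three thresholds and the function names are assumptions to tune against your own DNS logs, and legitimate CDN hostnames can trip the entropy check.

```python
import math
from collections import Counter

MAX_NAME_LEN = 100        # assumed: names longer than this are suspicious
MIN_LABEL_LEN = 20        # assumed: only score labels at least this long
MAX_LABEL_ENTROPY = 3.5   # assumed: bits per character


def shannon_entropy(text):
    """Shannon entropy of the characters in text, in bits per character."""
    counts = Counter(text)
    return -sum((n / len(text)) * math.log2(n / len(text)) for n in counts.values())


def suspicious_query(qname):
    """True when a query name is very long or its longest label looks random."""
    labels = qname.rstrip(".").split(".")
    longest = max(labels, key=len)
    if len(qname) > MAX_NAME_LEN:
        return True
    return len(longest) >= MIN_LABEL_LEN and shannon_entropy(longest) > MAX_LABEL_ENTROPY


def noisy_hosts(queries, limit):
    """queries: iterable of (src_ip, qname). Return hosts with more than limit queries."""
    per_host = Counter(src for src, _ in queries)
    return {host for host, n in per_host.items() if n > limit}


print(suspicious_query("www.example.com"))
print(suspicious_query("a9f3k2l8q0z7x1c5v4b6n8m2p0o9i7u3.suspicious-domain.xyz"))
```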

Step 4: Tools for Log Analysis (All Free)

Splunk (Free Up to 500MB/day)

index=* "185.220.101.42" | stats count by src_ip, dest_ip, dest_port

Elastic Stack (Free, Open Source)

{
  "query": {
    "match": {
      "dest_ip": "185.220.101.42"
    }
  }
}

Zeek (Open Source Network Analysis)

# Analyze a capture; Zeek writes conn.log, dns.log, http.log, etc. to the current directory
zeek -r capture.pcap

# Look for the suspicious IP in the connection log
grep "185.220.101.42" conn.log
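If you want more than grep, Zeek's default logs are tab-separated with a #fields header line naming the columns, which makes them easy to load. Here is a minimal reader under that assumption. The sample lines are made up and trimmed to a few conn.log columns, and read_zeek_log is a hypothetical helper, not part of Zeek.

```python
def read_zeek_log(lines):
    """Yield one dict per record from a tab-separated Zeek log."""
    fields = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            fields = line.split("\t")[1:]  # column names follow the #fields tag
        elif line and not line.startswith("#"):
            yield dict(zip(fields, line.split("\t")))


# Made-up, trimmed stand-in for conn.log; a real file has more header lines and columns
SAMPLE_CONN_LOG = [
    "#separator \\x09",
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tid.resp_h\tid.resp_p\tproto",
    "1780802062.0\tCabc123\t10.0.1.50\t44832\t185.220.101.42\t443\ttcp",
    "1780802070.0\tCdef456\t10.0.1.50\t51000\t10.0.1.100\t5432\ttcp",
    "#close\t2026-06-07-03-20-00",
]

records = list(read_zeek_log(SAMPLE_CONN_LOG))
hits = [r for r in records if r["id.resp_h"] == "185.220.101.42"]
for r in hits:
    print(r["id.orig_h"], "->", r["id.resp_h"] + ":" + r["id.resp_p"], r["proto"])
```

With a real file, pass an open file object instead of the sample list: read_zeek_log(open("conn.log")).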

Suricata (Open Source IDS)

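# Run Suricata against a capture file (-r reads a pcap; use -i <interface> for live traffic)
suricata -c /etc/suricata/suricata.yaml -r capture.pcap

# Check events involving the suspicious IP (alerts have "event_type":"alert")
grep "185.220.101.42" /var/log/suricata/eve.json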
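A plain grep on eve.json returns every event type that mentions the IP, not just alerts. Since the file holds one JSON object per line, a few lines of Python can narrow it down. This sketch relies on the event_type, src_ip, dest_ip and alert.signature fields; the sample events and the alerts_for_ip helper are made up for illustration.

```python
import json


def alerts_for_ip(lines, ip):
    """Return alert events whose source or destination matches ip."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written lines
        if event.get("event_type") == "alert" and ip in (event.get("src_ip"), event.get("dest_ip")):
            hits.append(event)
    return hits


# Made-up sample events standing in for /var/log/suricata/eve.json
SAMPLE_EVE = [
    '{"event_type": "flow", "src_ip": "10.0.1.50", "dest_ip": "185.220.101.42"}',
    '{"event_type": "alert", "src_ip": "10.0.1.50", "dest_ip": "185.220.101.42",'
    ' "alert": {"signature": "Example signature"}}',
    '{"event_type": "alert", "src_ip": "10.0.1.50", "dest_ip": "10.0.1.100",'
    ' "alert": {"signature": "Another example"}}',
    '{"truncated": ',
]

alerts = alerts_for_ip(SAMPLE_EVE, "185.220.101.42")
for event in alerts:
    print(event["src_ip"], "->", event["dest_ip"], event["alert"]["signature"])
```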

Step 5: Automate Detection

Don’t rely on manual investigation. Automate the patterns above.

Simple Anomaly Detection Script

#!/usr/bin/env python3
# anomaly_detector.py: flag (destination IP, port) pairs that are not in the baseline

import csv
from collections import defaultdict


def load_connections(path):
    """Return {src_ip: {(dest_ip, dest_port), ...}} from a CSV with the baseline.csv columns."""
    connections = defaultdict(set)
    with open(path, newline='') as f:
        for row in csv.DictReader(f):
            connections[row['Source IP']].add((row['Dest IP'], row['Dest Port']))
    return connections


# Load baseline (from baseline.csv) and current logs (same column names)
baseline = load_connections('baseline.csv')
current_connections = load_connections('current_logs.csv')

# Find anomalies: destinations in the current logs that the baseline never saw
for src_ip, destinations in current_connections.items():
    new_connections = destinations - baseline.get(src_ip, set())
    if new_connections:
        print(f"ANOMALY: {src_ip} connecting to new destinations:")
        for dest_ip, dest_port in sorted(new_connections):
            print(f"  {src_ip} -> {dest_ip}:{dest_port}")

Run this every hour, for example from cron. It flags any destination IP and port that isn’t in your baseline.

The Key Insight

Log analysis isn’t about tools. It’s about knowing what normal looks like and recognizing when something breaks that pattern.

Your baseline is your most important security asset. Build it. Maintain it. Respect it.

What’s Next

In the next post, I’ll cover building a home lab for AI security testing — how to set up a safe environment for learning, practicing, and testing AI security tools without touching production.

This post is part of the Cyber-AI initiative — free, open-source cybersecurity and AI education for everyone.

Found this useful? Share it with someone who’s on-call. They’ll thank you at 3 AM.
