7 Azure Security Gaps I Have Seen in Production (and How to Fix Them)

Published: December 14, 2025 at 07:59 PM EST

4 min read

Source: Dev.to

1. Inadequate Network Security Group (NSG) Rules

The Problem

Network Security Groups (NSGs) are the primary network‑level control for Azure VMs. Production environments often contain overly permissive inbound rules such as:

Allow * from 0.0.0.0/0
SSH (22) or RDP (3389) exposed to the public internet

Real‑world example

During a security audit, 12 production VMs were found with SSH open to the internet. NSG flow logs showed 50,000+ failed login attempts in 30 days.

How to Detect

Azure Portal

Network Security Groups → Inbound rules – look for:
- Source: Any or 0.0.0.0/0
- Ports: 22, 3389, 1433, 5432
Review Effective security rules per VM
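
The same portal checks can be scripted against an exported rule list. The sketch below is a minimal illustration, not the author's tooling: it assumes each rule is a dict shaped like the JSON that `az network nsg rule list` returns (`direction`, `access`, `sourceAddressPrefix`, `destinationPortRange`, and their plural variants), and the function names are hypothetical.

```python
from typing import Iterable

# Ports called out in the checklist above
RISKY_PORTS = {22, 3389, 1433, 5432}
# Source values that mean "anyone on the internet" (assumed spellings)
OPEN_SOURCES = {"*", "0.0.0.0/0", "internet"}


def _port_matches(port_spec: str, port: int) -> bool:
    """Return True if an NSG port spec ("*", "22", "20-25") covers the port."""
    spec = port_spec.strip()
    if spec == "*":
        return True
    if "-" in spec:
        low, high = spec.split("-", 1)
        return int(low) <= port <= int(high)
    return int(spec) == port


def find_open_rules(rules: Iterable[dict]) -> list:
    """Return (rule name, port) pairs for inbound Allow rules open to any source."""
    findings = []
    for rule in rules:
        if rule.get("direction", "").lower() != "inbound":
            continue
        if rule.get("access", "").lower() != "allow":
            continue
        # A rule may use the singular field, the plural field, or both
        sources = [rule.get("sourceAddressPrefix") or ""]
        sources += list(rule.get("sourceAddressPrefixes") or [])
        if not any(s.lower() in OPEN_SOURCES for s in sources):
            continue
        specs = [rule.get("destinationPortRange") or ""]
        specs += list(rule.get("destinationPortRanges") or [])
        specs = [s for s in specs if s]
        for port in sorted(RISKY_PORTS):
            if any(_port_matches(s, port) for s in specs):
                findings.append((rule["name"], port))
    return findings
```

Deny rules and rule priority are ignored here, so treat the output as a list of candidates to review rather than a verdict on effective access.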

KQL (NSG Flow Logs)

AzureNetworkAnalytics_CL
| where SubType_s == "FlowLog"
// Keep only sources outside the RFC 1918 private ranges (only 172.16.0.0/12 is private, not all of 172.x)
| where not(ipv4_is_private(SrcIP_s))
| where DestPort_d in (22, 3389)
| summarize Attempts = count() by SrcIP_s, DestPort_d
| order by Attempts desc
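
When flow records are exported and filtered outside Log Analytics, the private-versus-public split is easy to get wrong with string prefixes, because only 172.16.0.0/12 is private rather than every address starting with 172. A minimal sketch using Python's standard `ipaddress` module (the helper name `is_rfc1918` is hypothetical):

```python
import ipaddress

# RFC 1918 private ranges
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(ip: str) -> bool:
    """Return True if the address falls inside one of the RFC 1918 ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in RFC1918_NETWORKS)
```

For example, `is_rfc1918("172.32.0.1")` is false even though the address starts with "172.", so a login attempt from it would count as internet traffic.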

The Fix

Immediate

Remove 0.0.0.0/0 access for SSH/RDP
Allow only trusted office or VPN IP ranges
Use Azure Bastion for secure remote access
Enable Just‑In‑Time (JIT) VM access via Microsoft Defender for Cloud

Long‑term

Adopt Application Security Groups (ASGs)
Deploy Azure Firewall for centralized filtering
Enforce NSG standards with Azure Policy

2. Missing Azure Backup Policies

The Problem

Many teams assume Azure automatically backs up VMs. It does not. Production workloads have been found without a Recovery Services vault or any backup schedule.

How to Detect

Azure Portal

Recovery Services vaults → Backup items – compare against the VM inventory.

KQL Query

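Resources
| where type =~ "microsoft.compute/virtualmachines"
| project name, resourceGroup, vmId = tolower(id)
| join kind=leftouter (
    RecoveryServicesResources
    | where type contains "protectedItems"
    // Match on the full resource ID so same-named VMs in different resource groups are not confused
    | project protectedId = tolower(tostring(properties.sourceResourceId))
) on $left.vmId == $right.protectedId
// isnull() is always false for strings, so test for an empty value instead
| where isempty(protectedId)
| project name, resourceGroup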
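
The comparison against the VM inventory comes down to a set difference on resource IDs. As a minimal sketch, assuming both lists have already been exported as plain strings (the function name and the sample IDs in the usage note are hypothetical):

```python
def unprotected_vms(vm_ids, protected_source_ids):
    """Return VM resource IDs that have no matching backup item.

    Azure resource IDs are case-insensitive, so both sides are
    lowercased before comparing.
    """
    protected = {rid.lower() for rid in protected_source_ids}
    return sorted(vm for vm in vm_ids if vm.lower() not in protected)
```

Calling it with an inventory of two VMs and a single protected item whose ID differs only in letter case returns just the VM that has no backup item.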

The Fix

Create Recovery Services vaults per region.
Define backup policies aligned with your RPO/RTO.
Enable Soft Delete.
Test restores quarterly.
Configure backup alerts via Azure Monitor.

3. Weak Authentication Methods

The Problem

Password‑based SSH and RDP access remains common. Some environments reuse the same password across multiple admin accounts, creating a single point of failure.

How to Detect

Linux

grep -i PasswordAuthentication /etc/ssh/sshd_config
grep -i PubkeyAuthentication /etc/ssh/sshd_config
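
A plain grep also prints commented-out and duplicate lines, which makes the result easy to misread: sshd uses the first value it obtains for each keyword. The sketch below is a simplified illustration of that rule, with a hypothetical function name. It assumes the file contents have already been read into a string, stops at the first `Match` block, and does not follow `Include` directives.

```python
def effective_sshd_option(config_text: str, keyword: str, default: str) -> str:
    """Return the first global value of a keyword in sshd_config text."""
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        # Keywords and values may be separated by whitespace or "="
        parts = line.replace("=", " ", 1).split(None, 1)
        name = parts[0].lower()
        if name == "match":
            break  # conditional overrides are not evaluated in this sketch
        if name == keyword.lower() and len(parts) == 2:
            return parts[1].strip().strip('"').lower()
    return default
```

On a live host, `sshd -T` prints the effective configuration and is the more reliable check. The sketch is mainly useful for scanning configuration files collected from many machines.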

Windows

Review Azure AD Conditional Access policies.
Check Azure AD sign‑in logs for password‑based logins.

The Fix

Linux

Disable password authentication: PasswordAuthentication no
Enforce SSH key‑based authentication.
Store private keys in Azure Key Vault.
Enable Azure AD Login for Linux VMs.

Windows

Enforce Multi‑Factor Authentication (MFA).
Minimize local admin usage.
Deploy Privileged Access Workstations (PAWs).

4. Unencrypted Data in Transit

The Problem

HTTP endpoints, databases without TLS, and FTP transfers still exist in production, even for sensitive data.

How to Detect

Application Gateway / WAF Logs

AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS"
| where requestUri_s startswith "http://"
| summarize Count = count() by requestUri_s, clientIP_s
| order by Count desc

The Fix

Enforce HTTPS end‑to‑end.
Redirect http:// → https://.
Enable TLS 1.2+ for all database connections.
Replace FTP with SFTP/FTPS.
Manage certificates via Azure Key Vault.
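
On the client side, the TLS 1.2+ requirement can also be enforced in application code, so that a misconfigured server fails loudly instead of silently downgrading. A minimal sketch with Python's standard `ssl` module follows. The function name is hypothetical, and how the context is passed to a database driver depends on the driver.

```python
import ssl


def make_tls12_client_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and refuses TLS < 1.2."""
    # create_default_context() enables certificate and hostname verification
    context = ssl.create_default_context()
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```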

5. Improper Role‑Based Access Control (RBAC)

The Problem

Developers often hold Contributor or Owner rights at the subscription level, violating the principle of least privilege.

How to Detect

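authorizationresources
| where type =~ "microsoft.authorization/roleassignments"
// roleDefinitionId ends with the role's GUID, not its display name
| extend roleDefinitionId = tostring(properties.roleDefinitionId)
| where roleDefinitionId endswith "8e3af657-a8ff-443c-a75c-2fe8c4bcb635" // Owner
   or roleDefinitionId endswith "b24988ac-6180-42a0-ab88-20f7382dd24c"  // Contributor
| project principalId = tostring(properties.principalId), roleDefinitionId, scope = tostring(properties.scope)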
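
Role assignments reference a role by its definition ID, which ends in a GUID, rather than by display name. The sketch below shows one way to post-process exported assignments and keep only Owner or Contributor grants made directly at subscription scope. It assumes each assignment is a dict with `principalId`, `roleDefinitionId`, and `scope` keys. The function name is hypothetical, and the two GUIDs are the built-in role IDs for Owner and Contributor.

```python
import re

# Built-in role definition GUIDs
BROAD_ROLES = {
    "8e3af657-a8ff-443c-a75c-2fe8c4bcb635": "Owner",
    "b24988ac-6180-42a0-ab88-20f7382dd24c": "Contributor",
}
# Matches "/subscriptions/<id>" with nothing after it
SUBSCRIPTION_SCOPE = re.compile(r"^/subscriptions/[^/]+$", re.IGNORECASE)


def broad_subscription_assignments(assignments):
    """Return (principalId, role name, scope) for broad subscription-level grants."""
    findings = []
    for assignment in assignments:
        # The GUID is the last path segment of the role definition ID
        role_guid = assignment["roleDefinitionId"].rstrip("/").rsplit("/", 1)[-1].lower()
        role_name = BROAD_ROLES.get(role_guid)
        if role_name and SUBSCRIPTION_SCOPE.match(assignment["scope"]):
            findings.append((assignment["principalId"], role_name, assignment["scope"]))
    return findings
```

Assignments inherited from a management group have a different scope format and are not matched here, so they need a separate check.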

The Fix

Conduct regular RBAC audits.
Remove unnecessary subscription‑level roles.
Create custom roles with minimal permissions.
Use Azure AD Privileged Identity Management (PIM).
Enable access reviews for high‑privilege accounts.

6. Missing Activity Log Alerts

The Problem

Critical changes (e.g., NSG rule updates, VM deletions) occur without any alerts, leaving teams unaware.

How to Detect

Azure Monitor → Alerts – filter by Activity Log and verify that relevant categories are being monitored.

The Fix

Create alerts for:

NSG rule changes
VM creation/deletion
RBAC modifications
Defender for Cloud policy updates
Key Vault access changes
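
Each item in the list above corresponds to one or more Activity Log operation names. The sketch below maps a handful of them to the alert categories, for example when triaging exported Activity Log events. The selection of operations is an assumption about which ones matter most, not a complete list, and the function name is hypothetical.

```python
# Lowercased operation names mapped to the alert category they belong to
WATCHED_OPERATIONS = {
    "microsoft.network/networksecuritygroups/securityrules/write": "NSG rule change",
    "microsoft.network/networksecuritygroups/securityrules/delete": "NSG rule change",
    "microsoft.compute/virtualmachines/write": "VM creation or update",
    "microsoft.compute/virtualmachines/delete": "VM deletion",
    "microsoft.authorization/roleassignments/write": "RBAC modification",
    "microsoft.authorization/roleassignments/delete": "RBAC modification",
    "microsoft.security/policies/write": "Defender for Cloud policy update",
    "microsoft.keyvault/vaults/accesspolicies/write": "Key Vault access change",
}


def classify_event(operation_name: str):
    """Return the alert category for an operation name, or None if not watched."""
    return WATCHED_OPERATIONS.get(operation_name.lower())
```

The lookup is case-insensitive because operation names appear with mixed casing across logs and documentation.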

7. Exposed Management and Database Ports

The Problem

Beyond SSH/RDP, database ports and admin interfaces are often exposed to the internet (e.g., SQL, MySQL, PostgreSQL, MongoDB, Redis).

How to Detect

AzureNetworkAnalytics_CL
| where SubType_s == "FlowLog"
| where DestPort_d in (1433, 3306, 5432, 8080, 8443, 27017, 6379)
// Keep only sources outside the RFC 1918 private ranges (only 172.16.0.0/12 is private, not all of 172.x)
| where not(ipv4_is_private(SrcIP_s))
| summarize Count = count() by DestPort_d, DestIP_s
| order by Count desc
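
Raw port numbers are harder to triage than service names. As a small illustration, the sketch below labels the ports from the query above with the services that conventionally listen on them and counts hits per destination. It assumes the flows have been exported as `(dest_ip, dest_port)` tuples, and the function name is hypothetical.

```python
from collections import Counter

# Conventional default ports; a service can of course be moved elsewhere
PORT_SERVICES = {
    1433: "SQL Server",
    3306: "MySQL",
    5432: "PostgreSQL",
    8080: "HTTP (alternate)",
    8443: "HTTPS (alternate)",
    27017: "MongoDB",
    6379: "Redis",
}


def summarize_exposure(flows):
    """Count flows per (destination IP, service), most frequent first."""
    counts = Counter()
    for dest_ip, dest_port in flows:
        service = PORT_SERVICES.get(dest_port)
        if service is not None:
            counts[(dest_ip, service)] += 1
    return counts.most_common()
```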

The Fix

Close non‑essential ports.
Use Private Endpoints / Private Link for database access.
Deploy Application Gateway with WAF for web‑based admin interfaces.
Leverage Azure Bastion for secure remote management.
Follow Defender for Cloud recommendations.

Conclusion

Azure security is an ongoing process that requires continuous monitoring, regular audits, and proactive remediation. Most gaps stem from missing guardrails rather than missing tools. Leveraging Azure Policy, Defender for Cloud, and Infrastructure as Code can prevent these issues before they reach production.

Quick Security Checklist (≈ 85 minutes)

Audit NSG rules for 0.0.0.0/0 – 5 min
Verify VM backups – 10 min
Review RBAC assignments – 15 min
Check SSH password authentication – 10 min
Enable Activity Log alerts – 20 min
Scan for exposed DB ports – 10 min
Identify HTTP traffic – 15 min

Let’s Discuss

Have you encountered similar Azure security issues in production? What guardrails do you use to prevent them?

