7 Azure Security Gaps I Have Seen in Production (and How to Fix Them)

Published: December 14, 2025 at 07:59 PM EST
4 min read
Source: Dev.to

1. Inadequate Network Security Group (NSG) Rules

The Problem

Network Security Groups (NSGs) are the primary network‑level control for Azure VMs. Production environments often contain overly permissive inbound rules such as:

  • Allow * from 0.0.0.0/0
  • SSH (22) or RDP (3389) exposed to the public internet

Real‑world example

During a security audit, 12 production VMs were found with SSH open to the internet. NSG flow logs showed 50,000+ failed login attempts in 30 days.

How to Detect

Azure Portal

  • Network Security Groups → Inbound rules – look for:
    • Source: Any or 0.0.0.0/0
    • Ports: 22, 3389, 1433, 5432
  • Review Effective security rules per VM
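
The portal check above can also be scripted against exported rules. The following is a minimal sketch, assuming the rules are available as a list of dictionaries with the camelCase field names that `az network nsg rule list` emits as JSON (`direction`, `access`, `sourceAddressPrefix`, `destinationPortRange`, and their plural variants). The function names and the sample rule names are hypothetical.

```python
# Ports named in the checklist above: SSH, RDP, SQL Server, PostgreSQL.
RISKY_PORTS = {22, 3389, 1433, 5432}
# Source values that mean "anyone on the internet".
OPEN_SOURCES = {"*", "0.0.0.0/0", "Internet"}


def port_range_covers(port_range: str, port: int) -> bool:
    """Return True if an NSG port expression ("*", "22", "20-25") includes port."""
    if port_range == "*":
        return True
    if "-" in port_range:
        low, high = port_range.split("-", 1)
        return int(low) <= port <= int(high)
    return int(port_range) == port


def risky_inbound_rules(rules):
    """Return (rule name, exposed risky ports) for inbound Allow rules open to any source."""
    findings = []
    for rule in rules:
        if rule.get("direction") != "Inbound" or rule.get("access") != "Allow":
            continue
        # A rule may carry a single prefix, a list of prefixes, or both.
        sources = set(rule.get("sourceAddressPrefixes") or [])
        if rule.get("sourceAddressPrefix"):
            sources.add(rule["sourceAddressPrefix"])
        if not sources & OPEN_SOURCES:
            continue
        ranges = list(rule.get("destinationPortRanges") or [])
        if rule.get("destinationPortRange"):
            ranges.append(rule["destinationPortRange"])
        exposed = sorted(
            port for port in RISKY_PORTS
            if any(port_range_covers(r, port) for r in ranges)
        )
        if exposed:
            findings.append((rule["name"], exposed))
    return findings


if __name__ == "__main__":
    sample = [
        {"name": "allow-ssh-any", "direction": "Inbound", "access": "Allow",
         "sourceAddressPrefix": "*", "destinationPortRange": "22"},
        {"name": "office-only", "direction": "Inbound", "access": "Allow",
         "sourceAddressPrefix": "198.51.100.0/24", "destinationPortRange": "22"},
    ]
    for name, ports in risky_inbound_rules(sample):
        print(f"{name}: exposes {ports} to the internet")
```

This only inspects the rule definitions. It does not evaluate rule priority, so a higher-priority Deny rule could still block the traffic; the Effective security rules view remains the reference for what actually applies to a VM.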

KQL (NSG Flow Logs)

AzureNetworkAnalytics_CL
| where SubType_s == "FlowLog"
// Only 172.16.0.0/12 is private; a bare "172." prefix test would hide public sources.
| where not(ipv4_is_private(SrcIP_s))
| where DestPort_d in (22, 3389)
| summarize Attempts = count() by SrcIP_s, DestPort_d
| order by Attempts desc
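
The query separates internal sources from internet sources. When flow-log rows are exported and post-processed outside Log Analytics, the same split can be sketched with Python's `ipaddress` module. This is an illustration, not part of the original audit tooling: the row dictionaries are assumed to use the `SrcIP_s` and `DestPort_d` column names from the query, and the function names are hypothetical. Note that only 172.16.0.0/12 is private, so an address such as 172.32.0.1 counts as external.

```python
from collections import Counter
from ipaddress import ip_address, ip_network

# The three RFC 1918 private ranges.
RFC1918 = [
    ip_network("10.0.0.0/8"),
    ip_network("172.16.0.0/12"),
    ip_network("192.168.0.0/16"),
]


def is_internal(ip: str) -> bool:
    """Return True if ip falls inside one of the RFC 1918 ranges."""
    addr = ip_address(ip)
    return any(addr in net for net in RFC1918)


def external_attempts(rows, ports=(22, 3389)):
    """Count flows per (source IP, destination port) from non-private sources."""
    counts = Counter()
    for row in rows:
        port = int(row["DestPort_d"])
        if port in ports and not is_internal(row["SrcIP_s"]):
            counts[(row["SrcIP_s"], port)] += 1
    # Highest counts first, mirroring "order by Attempts desc".
    return counts.most_common()


if __name__ == "__main__":
    rows = [
        {"SrcIP_s": "203.0.113.7", "DestPort_d": 22},
        {"SrcIP_s": "203.0.113.7", "DestPort_d": 22},
        {"SrcIP_s": "172.32.0.1", "DestPort_d": 3389},
        {"SrcIP_s": "10.0.0.5", "DestPort_d": 22},
    ]
    for (src, port), attempts in external_attempts(rows):
        print(src, port, attempts)
```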

The Fix

Immediate

  • Remove 0.0.0.0/0 access for SSH/RDP
  • Whitelist trusted office or VPN IPs
  • Use Azure Bastion for secure remote access
  • Enable Just‑In‑Time (JIT) VM access via Microsoft Defender for Cloud

Long‑term

  • Adopt Application Security Groups (ASGs)
  • Deploy Azure Firewall for centralized filtering
  • Enforce NSG standards with Azure Policy

2. Missing Azure Backup Policies

The Problem

Many teams assume Azure backs up VMs automatically; it does not. Production workloads have been found with no Recovery Services vault and no backup schedule at all.

How to Detect

Azure Portal

  • Recovery Services vaults → Backup items – compare against the VM inventory.

KQL Query

Resources
| where type == "microsoft.compute/virtualmachines"
| project name, resourceGroup
| join kind=leftouter (
    RecoveryServicesResources
    | where type contains "protectedItems"
    | extend vmName = tostring(split(properties.sourceResourceId, "/")[8])
    | project vmName
) on $left.name == $right.vmName
// isnull() is always false for string columns, so test for an empty value instead.
| where isempty(vmName)
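
The comparison the query performs (VM inventory minus protected items) can be sketched offline as a set difference. This assumes you have exported the VM names and the `sourceResourceId` values of the backup items; the function names and the sample resource IDs are hypothetical.

```python
def vm_name_from_resource_id(resource_id: str) -> str:
    """Return the last path segment of an ARM resource ID, lowercased."""
    return resource_id.rstrip("/").split("/")[-1].lower()


def unprotected_vms(vm_names, protected_source_ids):
    """Return the VM names that have no matching backup item."""
    protected = {vm_name_from_resource_id(rid) for rid in protected_source_ids}
    # ARM names are case-insensitive, so compare in lowercase.
    return sorted(name for name in vm_names if name.lower() not in protected)


if __name__ == "__main__":
    inventory = ["web-01", "web-02", "db-01"]
    backup_items = [
        "/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg-prod"
        "/providers/Microsoft.Compute/virtualMachines/WEB-01",
    ]
    print(unprotected_vms(inventory, backup_items))
```

Matching on the VM name alone, as the query does, can hide a gap when two resource groups contain VMs with the same name. Comparing full resource IDs is stricter if your export includes them.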

The Fix

  • Create Recovery Services vaults per region.
  • Define backup policies aligned with your RPO/RTO.
  • Enable Soft Delete.
  • Test restores quarterly.
  • Configure backup alerts via Azure Monitor.

3. Weak Authentication Methods

The Problem

Password‑based SSH and RDP access remains common. Some environments reuse the same password across multiple admin accounts, creating a single point of failure.

How to Detect

Linux

grep -i PasswordAuthentication /etc/ssh/sshd_config
grep -i PubkeyAuthentication /etc/ssh/sshd_config
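
A plain `grep` also matches commented-out lines and does not show which value wins when a keyword appears more than once. The sketch below, with a hypothetical function name, reads the configuration text and returns the first uncommented value, since sshd uses the first value it obtains for each keyword. It makes two simplifying assumptions: keyword and value are separated by whitespace, and anything from the first `Match` block onward is ignored, as are `Include` files. The fallback defaults (`yes` for both keywords) are the upstream OpenSSH defaults.

```python
def effective_setting(config_text: str, keyword: str, default: str) -> str:
    """Return the first uncommented global value for keyword, or default."""
    for line in config_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        parts = stripped.split(None, 1)
        if parts[0].lower() == "match":
            break  # settings after Match apply conditionally; stop here
        if parts[0].lower() == keyword.lower() and len(parts) == 2:
            return parts[1].strip().lower()
    return default


if __name__ == "__main__":
    with open("/etc/ssh/sshd_config", encoding="utf-8") as handle:
        text = handle.read()
    print("PasswordAuthentication:", effective_setting(text, "PasswordAuthentication", "yes"))
    print("PubkeyAuthentication:", effective_setting(text, "PubkeyAuthentication", "yes"))
```

On the VM itself, `sshd -T` prints the effective configuration after includes are resolved and is the more authoritative check; the parser is mainly useful for auditing collected config files in bulk.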

Windows

  • Review Azure AD Conditional Access policies.
  • Check Azure AD sign‑in logs for password‑based logins.

The Fix

Linux

  • Disable password authentication: PasswordAuthentication no
  • Enforce SSH key‑based authentication.
  • Store private keys in Azure Key Vault.
  • Enable Azure AD Login for Linux VMs.

Windows

  • Enforce Multi‑Factor Authentication (MFA).
  • Minimize local admin usage.
  • Deploy Privileged Access Workstations (PAWs).

4. Unencrypted Data in Transit

The Problem

HTTP endpoints, databases without TLS, and FTP transfers still exist in production, even for sensitive data.

How to Detect

Application Gateway / WAF Logs

AzureDiagnostics
| where ResourceType == "APPLICATIONGATEWAYS"
| where requestUri_s startswith "http://"
| summarize Count = count() by requestUri_s, clientIP_s
| order by Count desc

The Fix

  • Enforce HTTPS end‑to‑end.
  • Redirect http:// → https://.
  • Enable TLS 1.2+ for all database connections.
  • Replace FTP with SFTP/FTPS.
  • Manage certificates via Azure Key Vault.
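
On the client side, the TLS 1.2+ requirement can be made explicit instead of relying on library defaults. A minimal sketch using Python's standard `ssl` module follows; the function name is hypothetical, and how the context is handed to a particular database driver varies by driver, so check its documentation.

```python
import ssl


def strict_tls_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and refuses TLS below 1.2."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Reject TLS 1.0 and 1.1 handshakes outright.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


if __name__ == "__main__":
    ctx = strict_tls_context()
    print(ctx.minimum_version, ctx.verify_mode, ctx.check_hostname)
```

Standard-library clients such as `urllib.request.urlopen(url, context=ctx)` accept the context directly.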

5. Improper Role‑Based Access Control (RBAC)

The Problem

Developers often hold Contributor or Owner rights at the subscription level, violating the principle of least privilege.

How to Detect

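authorizationresources
| where type == "microsoft.authorization/roleassignments"
// roleDefinitionId ends in a GUID, not a role name, so match the built-in role IDs.
| extend roleDefinitionId = tostring(properties.roleDefinitionId)
| where roleDefinitionId endswith "8e3af657-a8ff-443c-a75c-2fe8c4bcb635" // Owner
   or roleDefinitionId endswith "b24988ac-6180-42a0-ab88-20f7382dd24c" // Contributor
| project principalId = tostring(properties.principalId), scope = tostring(properties.scope)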
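
To narrow the results to the case described in the problem statement (broad roles granted at subscription level), the exported assignments can be filtered by scope. This is a sketch under stated assumptions: each assignment is a dictionary with `principalId`, `roleDefinitionId`, and `scope` keys, and the function name is hypothetical. The two GUIDs are the built-in role definition IDs for Owner and Contributor.

```python
import re

# Built-in role definition GUIDs; roleDefinitionId ends with one of these.
BROAD_ROLES = {
    "8e3af657-a8ff-443c-a75c-2fe8c4bcb635": "Owner",
    "b24988ac-6180-42a0-ab88-20f7382dd24c": "Contributor",
}
# Matches a scope that is exactly a subscription, with nothing nested below it.
SUBSCRIPTION_SCOPE = re.compile(r"^/subscriptions/[^/]+/?$", re.IGNORECASE)


def broad_subscription_assignments(assignments):
    """Return (principalId, role name, scope) for Owner/Contributor at subscription scope."""
    findings = []
    for item in assignments:
        role_guid = item["roleDefinitionId"].rstrip("/").split("/")[-1].lower()
        role = BROAD_ROLES.get(role_guid)
        if role and SUBSCRIPTION_SCOPE.match(item["scope"]):
            findings.append((item["principalId"], role, item["scope"]))
    return findings


if __name__ == "__main__":
    sub = "/subscriptions/00000000-0000-0000-0000-000000000000"
    sample = [{
        "principalId": "11111111-1111-1111-1111-111111111111",
        "roleDefinitionId": sub + "/providers/Microsoft.Authorization/roleDefinitions/"
                            "8e3af657-a8ff-443c-a75c-2fe8c4bcb635",
        "scope": sub,
    }]
    print(broad_subscription_assignments(sample))
```

Assignments inherited from a management group have a scope above the subscription and are not caught by this pattern, so review those separately.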

The Fix

  • Conduct regular RBAC audits.
  • Remove unnecessary subscription‑level roles.
  • Create custom roles with minimal permissions.
  • Use Azure AD Privileged Identity Management (PIM).
  • Enable access reviews for high‑privilege accounts.

6. Missing Activity Log Alerts

The Problem

Critical changes (e.g., NSG rule updates, VM deletions) occur without any alerts, leaving teams unaware.

How to Detect

  • Azure Monitor → Alerts – filter by Activity Log and verify that relevant categories are being monitored.

The Fix

Create alerts for:

  • NSG rule changes
  • VM creation/deletion
  • RBAC modifications
  • Defender for Cloud policy updates
  • Key Vault access changes
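
One way to check coverage is to compare the operation names your existing Activity Log alert rules watch against a required list. The sketch below is illustrative only. The mapping from each category above to resource provider operation names is my assumption, not something the original audit specified, and should be verified against the operations listed for each provider in your tenant. The function name and the input format (a flat list of operation names collected from existing alert rules) are also hypothetical.

```python
# Assumed mapping of alert categories to Activity Log operation names; verify before use.
REQUIRED_OPERATIONS = {
    "NSG rule changes": [
        "Microsoft.Network/networkSecurityGroups/securityRules/write",
        "Microsoft.Network/networkSecurityGroups/securityRules/delete",
    ],
    "VM creation/deletion": [
        "Microsoft.Compute/virtualMachines/write",
        "Microsoft.Compute/virtualMachines/delete",
    ],
    "RBAC modifications": [
        "Microsoft.Authorization/roleAssignments/write",
        "Microsoft.Authorization/roleAssignments/delete",
    ],
    "Defender for Cloud policy updates": [
        "Microsoft.Security/policies/write",
    ],
    "Key Vault access changes": [
        "Microsoft.KeyVault/vaults/write",
    ],
}


def missing_alert_coverage(alerted_operations):
    """Return {category: [operations with no alert]} for every category with a gap."""
    alerted = {op.lower() for op in alerted_operations}
    gaps = {}
    for category, operations in REQUIRED_OPERATIONS.items():
        missing = sorted(op for op in operations if op.lower() not in alerted)
        if missing:
            gaps[category] = missing
    return gaps


if __name__ == "__main__":
    existing = ["Microsoft.Network/networkSecurityGroups/securityRules/write"]
    for category, operations in missing_alert_coverage(existing).items():
        print(category, "->", operations)
```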

7. Exposed Management and Database Ports

The Problem

Beyond SSH/RDP, database ports and admin interfaces are often exposed to the internet (e.g., SQL, MySQL, PostgreSQL, MongoDB, Redis).

How to Detect

AzureNetworkAnalytics_CL
| where SubType_s == "FlowLog"
| where DestPort_d in (1433, 3306, 5432, 8080, 8443, 27017, 6379)
// Only 172.16.0.0/12 is private; a bare "172." prefix test would hide public sources.
| where not(ipv4_is_private(SrcIP_s))
| summarize Count = count() by DestPort_d, DestIP_s
| order by Count desc
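
Raw port numbers are easy to misread in a long result set. As a small sketch, the exported rows can be grouped by the service each port usually carries. The port-to-service labels follow the examples named in the problem statement, with 8080 and 8443 treated as web admin interfaces. The function name and the row shape (`DestPort_d`, `DestIP_s`, `Count`) are assumptions based on the query's output columns.

```python
from collections import defaultdict

# Conventional default ports for the services listed above.
PORT_SERVICES = {
    1433: "SQL Server",
    3306: "MySQL",
    5432: "PostgreSQL",
    8080: "Web admin (HTTP)",
    8443: "Web admin (HTTPS)",
    27017: "MongoDB",
    6379: "Redis",
}


def exposure_by_service(rows):
    """Group query rows by service, collecting exposed hosts and total flow counts."""
    hosts = defaultdict(set)
    flows = defaultdict(int)
    for row in rows:
        # _d columns are doubles in Log Analytics, so normalize to int first.
        port = int(row["DestPort_d"])
        service = PORT_SERVICES.get(port, f"port {port}")
        hosts[service].add(row["DestIP_s"])
        flows[service] += int(row["Count"])
    return {s: {"hosts": sorted(hosts[s]), "flows": flows[s]} for s in hosts}


if __name__ == "__main__":
    rows = [
        {"DestPort_d": 6379.0, "DestIP_s": "10.0.1.4", "Count": 10},
        {"DestPort_d": 6379.0, "DestIP_s": "10.0.1.5", "Count": 5},
    ]
    print(exposure_by_service(rows))
```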

The Fix

  • Close non‑essential ports.
  • Use Private Endpoints / Private Link for database access.
  • Deploy Application Gateway with WAF for web‑based admin interfaces.
  • Leverage Azure Bastion for secure remote management.
  • Follow Defender for Cloud recommendations.

Conclusion

Azure security is an ongoing process that requires continuous monitoring, regular audits, and proactive remediation. Most gaps stem from missing guardrails rather than missing tools. Leveraging Azure Policy, Defender for Cloud, and Infrastructure as Code can prevent these issues before they reach production.

Quick Security Checklist (≈ 85 minutes)

  • Audit NSG rules for 0.0.0.0/0 – 5 min
  • Verify VM backups – 10 min
  • Review RBAC assignments – 15 min
  • Check SSH password authentication – 10 min
  • Enable Activity Log alerts – 20 min
  • Scan for exposed DB ports – 10 min
  • Identify HTTP traffic – 15 min

Let’s Discuss

Have you encountered similar Azure security issues in production? What guardrails do you use to prevent them?
