The average mid-size enterprise with some cloud adoption has resources in 1.8 cloud environments. The company that says “we’re on Azure” may also have:
- An AWS development environment that Engineering spun up 3 years ago and forgot about
- A GCP project for a specific data science initiative
- Abandoned resources in all three that are still accruing costs
Multi-cloud discovery in M&A due diligence is not optional. It’s the only way to see the full cloud footprint.
The Three Hyperscalers in an M&A Context
Azure (Most Common)
Azure is the most common primary cloud for companies with significant Microsoft dependencies (M365, Windows Server, SQL Server). M&A due diligence should always include Azure discovery. Key areas:
- Subscriptions and management groups: What’s the organizational structure?
- Resource groups: Are resources logically organized or a sprawl?
- Virtual machines: Running instances, sizes, OS versions, security config
- Storage accounts: Blob storage, file shares, diagnostics data — sensitive data exposure risk
- Azure AD application registrations: What apps have been registered? What permissions have been granted?
AWS (Often Overlooked)
AWS sneaks into companies through engineering teams that prefer AWS services, acquired companies that ran on AWS pre-acquisition, and shadow IT cloud adoption by marketing or data teams.
Key AWS discovery areas:
- S3 buckets: Are they public? What data is in them?
- EC2 instances: Running instances across all regions
- IAM users and roles: Are there old IAM users from former employees that are still active?
- Lambda functions: Serverless code running in the target’s account
- Cost anomalies: Any service running in a region the company doesn’t operate in?
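The stale IAM user question above can be approximated from the IAM credential report, which AWS delivers as CSV. The sketch below is a minimal illustration, not a complete audit: the function name `stale_users`, the 90-day threshold, and the sample rows (a trimmed subset of the report's columns with made-up user names) are all assumptions. Matching users against a list of former employees would still need HR data that the report does not contain.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Placeholder values the credential report uses when no date is available.
_NO_DATE = {"N/A", "no_information", "not_supported", ""}


def _parse(ts):
    """Return an aware datetime, or None if the field holds no usable date."""
    if ts in _NO_DATE:
        return None
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def stale_users(report_csv, now, max_idle_days=90):
    """Users with an active credential but no recorded use within max_idle_days.

    The 90-day default is an assumed threshold, not an AWS recommendation.
    """
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        has_active_credential = (
            row.get("password_enabled") == "true"
            or row.get("access_key_1_active") == "true"
            or row.get("access_key_2_active") == "true"
        )
        if not has_active_credential:
            continue
        used = [
            d
            for d in (
                _parse(row.get("password_last_used", "")),
                _parse(row.get("access_key_1_last_used_date", "")),
                _parse(row.get("access_key_2_last_used_date", "")),
            )
            if d is not None
        ]
        # Never used, or last used before the cutoff: flag for review.
        if not used or max(used) < cutoff:
            stale.append(row["user"])
    return stale


# Trimmed, made-up sample: a real report has more columns and rows.
SAMPLE = """user,password_enabled,password_last_used,access_key_1_active,access_key_1_last_used_date,access_key_2_active,access_key_2_last_used_date
alice,true,2025-05-20T10:00:00+00:00,false,N/A,false,N/A
bob,false,N/A,true,2023-01-15T08:30:00+00:00,false,N/A
carol,false,N/A,false,N/A,false,N/A
"""

print(stale_users(SAMPLE, datetime(2025, 6, 1, tzinfo=timezone.utc)))  # ['bob']
```

Users with no active credential (like `carol` here) are skipped because they cannot authenticate; whether to report them separately is a judgment call for the DD team.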
GCP (Least Common But Not Rare)
Google Cloud Platform is common in:
- Data science / ML environments (BigQuery, Vertex AI, Dataflow)
- Companies that acquired a startup that ran on GCP
- Development environments where engineers prefer GCP’s tooling
Key GCP discovery areas:
- GCS buckets: The GCP storage equivalent of S3 buckets
- Compute Engine instances
- BigQuery datasets: What data is being analyzed? Is any of it personal data?
- Service accounts: The GCP equivalent of Azure AD service principals
The ACQI Multi-Cloud Discovery Approach
ACQI’s multi-cloud discovery modules connect via read-only API access to each hyperscaler’s management plane:
For Azure: Read-only Azure Management API access (Azure Resource Graph, Azure AD Graph where still needed)
For AWS: Read-only AWS Config or IAM credential report (read-only access via AWS Organizations if available)
For GCP: Read-only GCP Asset Inventory API
The output is a consolidated multi-cloud inventory: a single view of every cloud resource in every cloud environment the target uses, with a cost estimate and risk classification for each.
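One way to picture a consolidated inventory is a single normalized record type shared by all three clouds. The sketch below is an assumption about what such a structure could look like, not a description of ACQI's actual schema: the `CloudResource` fields, the `summarize` helper, and every value in the sample inventory are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudResource:
    """Hypothetical normalized record for one resource in any cloud."""
    provider: str             # "azure", "aws", or "gcp"
    account: str              # subscription, account, or project identifier
    resource_type: str        # normalized type, e.g. "vm", "bucket", "dataset"
    name: str
    region: str
    monthly_cost_usd: float   # estimate, not a billed amount
    risk: str                 # "low", "medium", or "high"


def summarize(resources):
    """Estimated monthly cost and high-risk resource count per provider."""
    summary = defaultdict(lambda: {"monthly_cost_usd": 0.0, "high_risk": 0})
    for r in resources:
        entry = summary[r.provider]
        entry["monthly_cost_usd"] += r.monthly_cost_usd
        if r.risk == "high":
            entry["high_risk"] += 1
    return dict(summary)


# Illustrative values only.
inventory = [
    CloudResource("azure", "sub-prod", "vm", "app-vm-01", "westeurope", 310.0, "low"),
    CloudResource("aws", "acct-legacy", "bucket", "marketing-assets", "us-east-1", 45.0, "high"),
    CloudResource("aws", "acct-legacy", "vm", "old-staging", "us-east-1", 220.0, "medium"),
    CloudResource("gcp", "ds-project", "dataset", "campaign_analysis", "europe-west1", 90.0, "high"),
]

for provider, totals in summarize(inventory).items():
    print(provider, totals)
```

Normalizing the resource type up front is what makes cross-cloud questions (for example, "all public buckets, regardless of provider") a single filter rather than three separate queries.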
The Security Findings That Change the Deal
S3/GCS Bucket ACLs Set to Public
In 2024, a PE firm discovered during IT DD that the image assets for the target’s primary marketing website were stored in an S3 bucket configured as public. Worse, the bucket also contained archived customer data from a 2021 campaign that should have been deleted. The data was customer data, though not personal data. Because the GDPR covers only personal data, this becomes a GDPR Article 5(1)(f) (integrity and confidentiality) issue only if some of the records turn out to identify individuals; either way, it is a confidentiality exposure.
The S3 bucket was immediately made private, and a remediation item was added to the deal’s integration plan. Because the finding was caught in DD, it did not become a post-close incident.
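A public-ACL check of this kind can be sketched against the shape of an S3 `GetBucketAcl` response, where public access appears as a grant to the `AllUsers` or `AuthenticatedUsers` group. The helper name `public_grants` and the sample ACL are illustrative assumptions. Note that ACLs are only one path to exposure: bucket policies and Block Public Access settings also determine whether a bucket is reachable, so this check alone is not sufficient.

```python
# Group URIs that S3 ACLs use for "everyone" and "any authenticated AWS user".
PUBLIC_GROUP_URIS = {
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
}


def public_grants(acl):
    """Return the sorted permissions an ACL grants to public groups."""
    return sorted(
        {
            grant["Permission"]
            for grant in acl.get("Grants", [])
            if grant.get("Grantee", {}).get("Type") == "Group"
            and grant["Grantee"].get("URI") in PUBLIC_GROUP_URIS
        }
    )


# Made-up ACL in the shape returned by GetBucketAcl.
sample_acl = {
    "Owner": {"ID": "example-owner-id"},
    "Grants": [
        {
            "Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
            "Permission": "FULL_CONTROL",
        },
        {
            "Grantee": {
                "Type": "Group",
                "URI": "http://acs.amazonaws.com/groups/global/AllUsers",
            },
            "Permission": "READ",
        },
    ],
}

print(public_grants(sample_acl))  # ['READ']
```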
Orphaned AWS Accounts
A target company had 4 AWS accounts. The integration team found them via AWS Organizations. Only 2 were in use by the primary business. The third was an abandoned staging environment from a 2022 product launch. The fourth was running up $14,000/month in compute costs for workloads that had been migrated to Azure in Q4 2024 and never shut down.
Total cloud waste discovered in this one finding: $168,000/year ($14,000/month × 12).
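The arithmetic behind that figure is simple annualization of the monthly spend on accounts nobody is using. In the sketch below, the account names and the `annual_waste_usd` helper are hypothetical; costs the source does not state are left as `None` and skipped rather than guessed.

```python
def annual_waste_usd(accounts):
    """Annualize monthly spend of unused accounts; unknown costs are skipped."""
    return 12 * sum(
        a["monthly_cost_usd"]
        for a in accounts
        if not a["in_use"] and a["monthly_cost_usd"] is not None
    )


# Account names are placeholders; only the $14,000/month figure is from the text.
accounts = [
    {"name": "primary-a", "in_use": True, "monthly_cost_usd": None},
    {"name": "primary-b", "in_use": True, "monthly_cost_usd": None},
    {"name": "abandoned-staging", "in_use": False, "monthly_cost_usd": None},
    {"name": "migrated-workloads", "in_use": False, "monthly_cost_usd": 14_000},
]

print(annual_waste_usd(accounts))  # 168000
```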
Cross-Cloud Data Transfer Costs
If the target company is running workloads in multiple clouds simultaneously, they are paying egress charges every time data moves between clouds. For companies with significant data volumes (video processing, big data analytics, backup and disaster recovery), cross-cloud egress can be a significant line item, and one that may not become visible until the invoice is issued, roughly 30 days after the traffic occurred.
Multi-cloud discovery identifies which companies have cross-cloud traffic and estimates the egress cost. This goes into the integration plan: does the company consolidate to a single cloud (saving egress costs) or maintain multi-cloud (justified by technical requirements)?
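A rough egress estimate only needs the observed flows and a per-GB rate for the cloud the data leaves, since outbound transfer is billed by the source provider. The sketch below assumes a flat rate per provider; the function name, the flow volumes, and the rates are placeholders, not quoted prices, and real pricing is tiered and varies by region.

```python
def monthly_egress_cost_usd(flows, rate_per_gb_usd):
    """Estimate monthly cross-cloud egress cost.

    flows: iterable of (source_cloud, destination_cloud, gb_per_month).
    rate_per_gb_usd: flat per-GB rate keyed by source cloud (a simplification).
    """
    total = 0.0
    for source, destination, gb in flows:
        if source != destination:  # traffic inside one cloud is ignored here
            total += gb * rate_per_gb_usd[source]
    return total


# Placeholder volumes and rates, for illustration only.
flows = [
    ("aws", "azure", 10_000),
    ("azure", "azure", 50_000),
    ("gcp", "aws", 2_000),
]
rates = {"aws": 0.10, "azure": 0.10, "gcp": 0.10}

print(round(monthly_egress_cost_usd(flows, rates), 2))
```

Running the same estimate with the flows that would remain after consolidation gives the savings figure for the consolidate-versus-maintain decision.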
The Compliance Finding That Affects the Deal Model
GDPR Article 28 requires data processing agreements with all processors that handle EU personal data. Cloud providers are processors. Separately, if the target company is storing EU personal data in regions outside the EU, whether in AWS, GCP, or Azure, the transfer mechanism (SCCs, adequacy decisions, binding corporate rules) required under GDPR Chapter V needs to be documented.
Multi-cloud discovery identifies the geographic location of all cloud resources. If data centers in the US are processing EU personal data without SCCs or another valid transfer mechanism, this is a compliance gap that needs remediation, and the cost of remediation needs to be in the deal model.
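Once each resource carries a region and a data classification, the residency gap becomes a simple filter. The sketch below is a minimal illustration under stated assumptions: the field names, the resource names, and the caller-supplied `eu_regions` set are hypothetical, and deciding which regions count as in-scope (and which mechanisms are valid) is a legal question the code does not answer.

```python
def transfer_gaps(resources, eu_regions):
    """Names of resources holding EU personal data outside the given EU
    regions with no documented transfer mechanism."""
    return [
        r["name"]
        for r in resources
        if r["holds_eu_personal_data"]
        and r["region"] not in eu_regions
        and not r.get("transfer_mechanism")
    ]


# Example EU regions: AWS Frankfurt, Azure West Europe, GCP Belgium.
eu_regions = {"eu-central-1", "westeurope", "europe-west1"}

# Made-up resources for illustration.
resources = [
    {"name": "crm-backup", "region": "us-east-1",
     "holds_eu_personal_data": True, "transfer_mechanism": None},
    {"name": "eu-orders", "region": "eu-central-1",
     "holds_eu_personal_data": True, "transfer_mechanism": None},
    {"name": "analytics-export", "region": "us-central1",
     "holds_eu_personal_data": True, "transfer_mechanism": "SCCs"},
    {"name": "build-cache", "region": "us-west-2",
     "holds_eu_personal_data": False, "transfer_mechanism": None},
]

print(transfer_gaps(resources, eu_regions))  # ['crm-backup']
```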
The Multi-Cloud Discovery Checklist
For each cloud environment (Azure, AWS, GCP):
Compute (2 items per environment)
- List all running virtual machines/instances
- Identify all serverless functions (Lambda, Azure Functions, Cloud Functions)
Storage (2 items per environment)
- List all storage buckets/blobs with their ACL and encryption settings
- Flag any buckets that are publicly accessible
Identity (2 items per environment)
- List all service principals/service accounts/IAM roles
- Identify any cross-account access (AWS cross-account roles, Azure Lighthouse, GCP organization-level access)
Cost (2 items per environment)
- Report the estimated monthly cost per environment
- Flag any resources that appear to be orphaned (no network connections, no recent API calls)
Compliance (2 items per environment)
- Identify the data residency of all storage resources
- Flag any personal data processing without documented transfer mechanism
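The orphaned-resource flag in the Cost items above (no network connections, no recent API calls) can be sketched as a predicate over inventory records. The field names `network_connections` and `last_api_call`, the resource names, and the 30-day idle window are assumptions for illustration; a flagged resource is a candidate for review, not proof that it is safe to delete.

```python
from datetime import datetime, timedelta, timezone


def looks_orphaned(resource, now, idle_days=30):
    """True if the resource has no network connections and no API call
    within idle_days (or no recorded API call at all)."""
    last_call = resource.get("last_api_call")
    idle = last_call is None or now - last_call > timedelta(days=idle_days)
    return resource.get("network_connections", 0) == 0 and idle


now = datetime(2025, 6, 1, tzinfo=timezone.utc)

# Made-up records for illustration.
candidates = [
    {"name": "vm-a", "network_connections": 0, "last_api_call": None},
    {"name": "vm-b", "network_connections": 0,
     "last_api_call": datetime(2025, 5, 25, tzinfo=timezone.utc)},
    {"name": "vm-c", "network_connections": 3, "last_api_call": None},
    {"name": "vm-d", "network_connections": 0,
     "last_api_call": datetime(2025, 1, 10, tzinfo=timezone.utc)},
]

print([r["name"] for r in candidates if looks_orphaned(r, now)])  # ['vm-a', 'vm-d']
```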