Alibaba Cloud (Aliyun) -- Architecture Patterns
This note covers the major architecture patterns for complex project setups on Alibaba Cloud: single-VPC, multi-VPC, multi-account, multi-region, DR, DMZ, and Landing Zone. Each pattern includes the key services, recommended configurations, and real-world examples drawn from official Alibaba Cloud documentation and established practice.
1. Single Project with Single VPC
Architecture Summary
A single Virtual Private Cloud (VPC) is the foundational building block on Alibaba Cloud. A VPC is an isolated virtual network where you define a private CIDR block, create vSwitches (subnets) across Availability Zones, and attach route tables and gateways. For a single project, one VPC with multi-AZ vSwitches provides isolation, high availability, and simplicity.
        Internet --> WAF --> SLB (public vSwitch, AZ-a/b)
                              |
                     +--------+--------+
                     |                 |
             ECS App (private  ECS App (private
              vSwitch, AZ-a)    vSwitch, AZ-b)
                     |                 |
    RDS/PolarDB (private vSwitch, AZ-a, standby in AZ-b)
Key Services
| Service | Role |
|---|---|
| VPC | Isolated virtual network with custom CIDR, vSwitches, route tables |
| vSwitch | Subnet within a VPC, bound to a single AZ; resources attach here |
| Route Table | System route table (auto-created) + custom routes; controls traffic forwarding |
| SLB / ALB | Server Load Balancer (Layer 4/7) distributes traffic across AZs |
| NAT Gateway | Provides outbound Internet (SNAT) and inbound port forwarding (DNAT) for private resources |
| Security Group | Stateful per-instance firewall (like AWS security groups) |
| Network ACL | Stateless subnet-level packet filter (like AWS NACLs) |
| EIP (Elastic IP) | Public IP that can be bound to NAT Gateway, SLB, or individual ECS |
Recommended Configurations
- CIDR planning: Use RFC 1918 ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16). Reserve headroom; VPC CIDR cannot overlap with any VPC you plan to peer later. Use /16 for the VPC and /24 for vSwitches as a starting point.
- Multi-AZ vSwitches: Create at least two vSwitches in different AZs. Place SLB, ECS, and RDS replicas across both.
- Subnet tiers: Separate into public (DMZ), private (application), and data (database) subnets. Use route tables to enforce tiered traffic flow.
- No public IPs on app/DB: Application and database instances should have no EIPs. Use NAT Gateway (SNAT) for outbound Internet and SLB for inbound.
- Security Groups: One SG per tier (web, app, db), least-privilege rules. Allow web SG -> app SG -> db SG only on required ports.
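The CIDR planning advice above can be checked offline before any resources exist. The following is a minimal sketch using Python's standard `ipaddress` module; the tier names, zone labels, and the `overlaps_any` helper are hypothetical and only illustrate carving a /16 into /24 vSwitch blocks and testing a candidate VPC range for overlap.

```python
import ipaddress

# Hypothetical plan: one /16 VPC carved into /24 vSwitch blocks.
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = vpc.subnets(new_prefix=24)  # generator of /24 blocks, in address order

# Placeholder tier and zone labels; real zone IDs depend on the region.
layout = [(tier, zone) for tier in ("public", "app", "data") for zone in ("az-a", "az-b")]
plan = {f"{tier}-{zone}": next(subnets) for tier, zone in layout}


def overlaps_any(candidate: str, existing: list[str]) -> bool:
    """Return True if the candidate CIDR overlaps any CIDR in existing."""
    net = ipaddress.ip_network(candidate)
    return any(net.overlaps(ipaddress.ip_network(e)) for e in existing)


for name, net in plan.items():
    print(f"{name}: {net}")
```

Running the overlap check against every VPC you might peer with later is cheap, whereas renumbering a live VPC is not.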
Real-World Example
A SaaS startup deploys a three-tier web application in China (Shanghai) region.
One VPC (10.0.0.0/16) with vSwitches in AZ b and g. Public vSwitches
host an ALB instance. Private vSwitches host ECS auto-scaling groups. A managed
RDS MySQL instance runs with multi-AZ HA. NAT Gateway in the public vSwitch
provides egress. Cloud Firewall is enabled on the Internet border.
2. Multi-VPC Architecture
When workloads span multiple VPCs -- whether for environment isolation (dev/staging/prod), business-unit separation, or regulatory compliance -- Alibaba Cloud offers three primary inter-VPC connectivity options.
2a. Cloud Enterprise Network (CEN) -- Hub-and-Spoke
Architecture Summary
CEN is Alibaba Cloud's managed WAN service. A Transit Router (TR) in each region acts as a hub; VPCs, VPN connections, Express Connect circuits, and Cloud Connect Network (CCN) instances attach as spokes. Traffic flows through the TR's route tables, enabling fine-grained routing policy, traffic isolation, and centralized egress.
An Enterprise Edition TR can connect up to 1,000 VPCs in a single region. Inter-region traffic traverses Alibaba Cloud's private backbone with a P99 hourly packet loss rate target below 0.0001%.
+----------------------------+
| CEN Instance |
| |
Region A | Region B |
+--------+ | +--------+ |
| TR-A |-----| TR-B | |
+-+--+--++ | +-+--+--++ |
| | | | | | | |
VPC VPC VPC VPC VPC VBR(Express Connect)
to on-prem DC
Key Services
| Service | Role |
|---|---|
| CEN Instance | Container for Transit Routers; one CEN can hold multiple TRs |
| Transit Router (TR) | Regional hub; Enterprise Edition supports custom route tables |
| TR Route Table | System or custom; controls inter-VPC routing, isolation, and traffic steering |
| Bandwidth Package | Cross-region bandwidth allocation (purchased separately) |
| Network Instance Connection | Attachment: VPC connection, VBR connection, VPN connection, CCN connection, ECR connection, inter-region connection |
Recommended Configurations
- Use Enterprise Edition TR (required for custom route tables, traffic isolation, and service chaining).
- Hub VPC pattern: Deploy a centralized "shared services" VPC (DNS, NTP, security appliances, NAT egress) attached to the TR. Use custom route tables to let spoke VPCs reach the hub but not each other (isolation).
- Centralized DMZ egress: Route all Internet-bound traffic from spoke VPCs through a hub VPC with Cloud Firewall and NAT Gateway for unified inspection.
- Bandwidth planning: Purchase cross-region bandwidth packages based on actual traffic. Set per-connection bandwidth limits to prevent one spoke from saturating the link.
- Route map policies: Use TR route maps for path selection, route manipulation, and traffic engineering.
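The hub VPC isolation idea can be illustrated with a small local model. This sketch does not call any Alibaba Cloud API; the route table names, attachment names, and CIDRs are hypothetical. It only shows why associating spokes with a route table that contains the hub's CIDR alone leaves spoke-to-spoke traffic without a route.

```python
import ipaddress

# Each modelled route table maps destination CIDR -> next-hop attachment.
ROUTE_TABLES = {
    # Spokes only learn the hub CIDR, so spoke-to-spoke traffic has no route.
    "spoke-rt": {"10.0.0.0/16": "hub"},
    # The hub learns every spoke so it can deliver return traffic.
    "hub-rt": {"10.1.0.0/16": "spoke-a", "10.2.0.0/16": "spoke-b"},
}

# Which route table each attachment uses for lookups (hypothetical names).
ASSOCIATIONS = {"hub": "hub-rt", "spoke-a": "spoke-rt", "spoke-b": "spoke-rt"}


def next_hop(src_attachment: str, dst_ip: str):
    """Longest-prefix match in the route table associated with the source."""
    table = ROUTE_TABLES[ASSOCIATIONS[src_attachment]]
    addr = ipaddress.ip_address(dst_ip)
    matches = [
        (ipaddress.ip_network(cidr), hop)
        for cidr, hop in table.items()
        if addr in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no route: the packet is dropped
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(next_hop("spoke-a", "10.0.0.10"))  # reaches the hub
print(next_hop("spoke-a", "10.2.0.10"))  # no route to the other spoke
```

Adding a default route (0.0.0.0/0) that points at the hub in the spoke table would model the centralized egress pattern in the same way.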
Real-World Example
A financial services company has separate VPCs for trading, risk analytics, and back-office in China (Shanghai) and China (Beijing). A CEN with Enterprise TRs in both regions connects them. A hub "security" VPC in Shanghai runs Cloud Firewall and IDS. Custom route tables ensure trading VPC traffic passes through the security VPC before reaching the Internet, while back-office VPCs are isolated from trading. Cross-region bandwidth package: 2 Gbit/s between Shanghai and Beijing.
2b. VPC Peering Connection
Architecture Summary
VPC Peering is a direct, one-to-one connection between two VPCs. It supports same-region and cross-region peering. Unlike CEN, it is a point-to-point link without a central hub. Routes must be manually added to each VPC's route table (or are auto-propagated if using the same account).
VPC-A (10.0.0.0/16) <--peering--> VPC-B (10.1.0.0/16)
Route table: 10.1.0.0/16 -> peering Route table: 10.0.0.0/16 -> peering
Key Services
| Service | Role |
|---|---|
| VPC Peering Connection | Direct link between two VPCs |
| VPC Route Table | Manual route entries pointing to the peering connection |
Recommended Configurations
- Ensure CIDR blocks do not overlap between peered VPCs.
- Same-account peering is simplest; cross-account peering requires acceptance by the peer account.
- For more than ~5-10 VPCs, prefer CEN over individual peering links to avoid combinatorial explosion of peering connections and route entries.
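- VPC Peering is billed by data transfer. Charges depend on whether the peered VPCs are in the same region or in different regions, so confirm current pricing for your topology.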
When to Choose VPC Peering over CEN
- Small number of VPCs (2-4).
- Simple, static connectivity without complex routing policy.
- Lower cost for low-volume, same-region traffic.
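The scaling argument behind these guidelines is plain arithmetic. As a small sketch (the function names are illustrative), a full mesh of n VPCs needs n(n-1)/2 peering connections and n(n-1) route entries, while a hub needs one attachment per VPC:

```python
def full_mesh_links(n: int) -> int:
    """Peering connections needed to connect every pair of n VPCs."""
    return n * (n - 1) // 2


def full_mesh_route_entries(n: int) -> int:
    """Route entries needed if each VPC holds one route per peer."""
    return n * (n - 1)


def hub_attachments(n: int) -> int:
    """Attachments needed when every VPC connects to one hub."""
    return n


for n in (2, 4, 10, 50):
    print(n, full_mesh_links(n), full_mesh_route_entries(n), hub_attachments(n))
```

At 4 VPCs the mesh is still only 6 links; at 10 it is already 45, which is roughly where the recommendation to move to CEN starts to apply.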
2c. Express Connect (Leased-Line Inter-VPC)
Architecture Summary
Express Connect provides physically dedicated, private circuits between your data center and Alibaba Cloud VPCs. It can also be used for VPC-to-VPC connections within the same region. A Virtual Border Router (VBR) sits at the Alibaba Cloud edge; an Express Connect Router (ECR) manages global hybrid-cloud routing.
Express Connect circuits offer up to 100 Gbit/s per connection. ECMP across multiple circuits can aggregate to Tbit/s. Network traffic does not traverse the Internet.
Key Services
| Service | Role |
|---|---|
| Express Connect Circuit | Physical leased line from partner or direct connect |
| VBR (Virtual Border Router) | Router at Alibaba Cloud edge for circuit termination |
| ECR (Express Connect Router) | Global hybrid-cloud routing with dynamic BGP support |
| Hosted Connection | Shared circuit provided by an Express Connect partner |
Recommended Configurations
- Use for hybrid cloud (DC-to-VPC) with guaranteed SLA bandwidth.
- For production workloads, provision redundant circuits to different access points and use ECMP for failover.
- Enable BGP dynamic routing through the ECR for automatic route exchange.
- Express Connect vs. VPN: Express Connect offers higher bandwidth (up to 100 Gbit/s vs. VPN limited by public IP bandwidth), lower latency, and physical isolation. Use VPN as a backup or for non-critical workloads.
Connectivity Comparison Matrix
| Criteria | CEN (Transit Router) | VPC Peering | Express Connect |
|---|---|---|---|
| Topology | Hub-and-spoke | Point-to-point | Point-to-point (physical) |
| Scalability | Up to 1,000 VPCs per TR | O(n^2) links for full mesh | Limited by circuit capacity |
| Cross-region | Yes (backbone + bandwidth package) | Yes | Same-region VPC-to-VPC only |
| Routing | Centralized, custom route tables | Manual per-VPC routes | BGP via ECR or static |
| Isolation policy | Fine-grained (route tables) | All-or-nothing | All-or-nothing |
| Provisioning | Minutes (cloud) | Minutes (cloud) | Days-weeks (physical circuit) |
| Cost | Bandwidth packages + TR processing | Data transfer fees | Port + circuit + bandwidth |
| Best for | Enterprise multi-VPC | Small/simple setups | Hybrid cloud, compliance |
3. Multi-Account Strategy
Architecture Summary
Alibaba Cloud's multi-account strategy centers on Resource Directory -- a hierarchical account management service similar to AWS Organizations. Combined with Cloud Governance Center and Resource Access Management (RAM), it provides centralized governance, billing, and access control across all accounts in an organization.
Resource Directory (Root Account / Management Account)
|
+-- Core OU
| +-- Log Archive Account
| +-- Security/Audit Account
| +-- Shared Services Account (networking, DNS, images)
| +-- Network Account (CEN, Transit Router)
|
+-- Workloads OU
| +-- Business Unit A OU
| | +-- A-Dev Account
| | +-- A-Staging Account
| | +-- A-Prod Account
| +-- Business Unit B OU
| +-- B-Prod Account
|
+-- Sandbox OU
+-- Experiment Account
Key Services
| Service | Role |
|---|---|
| Resource Directory | Multi-account hierarchy: root, folders (OUs), member accounts. Supports trusted access delegation to services. |
| Cloud Governance Center | Landing Zone setup, account factory, compliance baselines, drift detection |
| RAM (Resource Access Management) | Fine-grained access control: users, groups, roles, policies. Supports SSO and external IdP federation. |
| Resource Groups | Logical grouping of resources across accounts for unified management |
| Tags | Key-value labels for cost allocation, access control, and automation |
| ActionTrail | Multi-account audit logging (API calls), centralized to a log archive account |
| Cloud Config | Compliance rules and configuration tracking across accounts |
| Control Policies | Organization-level guardrails (similar to AWS SCPs) restricting available actions |
Recommended Configurations
- Separate accounts by lifecycle stage: Place dev, staging, and prod in different accounts rather than in different VPCs within the same account.
- Centralized networking account: Owns the CEN instance, Transit Routers, and shared VPCs. Workload VPCs attach to the TR from their own accounts.
- Centralized logging: All accounts stream ActionTrail logs and SLS (Simple Log Service) logs to the Log Archive account.
- Centralized security: Security Center and Cloud Firewall are managed from the Security account with delegated administration.
- Account Factory: Use Cloud Governance Center's account factory or Terraform/ROS to provision new accounts with baseline configurations (RAM roles, guardrails, logging, networking).
- SSO: Use Alibaba Cloud SSO or federate with an external IdP (Okta, Azure AD, etc.) via SAML. Map IdP groups to RAM roles in each account.
- Cost management: Enable consolidated billing at the management account. Use tags for cost allocation across business units.
Real-World Example
A multinational enterprise organizes 50+ Alibaba Cloud accounts under Resource
Directory. Core OU holds the management, log-archive, security, and shared-network
accounts. Each business unit has its own OU with dev/staging/prod accounts. The
shared-network account owns a CEN instance with Enterprise TRs in Shanghai,
Beijing, and Singapore. Workload accounts create VPCs and attach them to the TR
via RAM cross-account authorization. Cloud Governance Center enforces baseline
policies: no public ECS instances, ActionTrail must be enabled, all resources
must have an owner tag.
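Two of the baseline policies in this example (no public ECS instances, every resource has an owner tag) can be expressed as a simple check over a resource inventory. The sketch below is a local model only: the `Resource` shape and the `violations` helper are hypothetical and are not the Cloud Config or Cloud Governance Center rule format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Resource:
    # Hypothetical inventory record; real inventories carry many more fields.
    resource_id: str
    resource_type: str
    public_ip: Optional[str] = None
    tags: dict = field(default_factory=dict)


def violations(resource: Resource) -> list[str]:
    """Return the baseline rules this resource breaks, in a fixed order."""
    found = []
    if resource.resource_type == "ecs" and resource.public_ip:
        found.append("public ECS instance")
    if not resource.tags.get("owner"):
        found.append("missing owner tag")
    return found


inventory = [
    Resource("i-example-1", "ecs", public_ip="203.0.113.10"),
    Resource("i-example-2", "ecs", tags={"owner": "team-a"}),
]
for r in inventory:
    print(r.resource_id, violations(r))
```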
4. Multi-Zone and Multi-Region Deployment Patterns
Multi-Zone (Intra-Region)
Alibaba Cloud regions contain multiple Availability Zones (AZs) -- physically isolated data centers with low-latency links (~1-2ms between AZs). Deploying across at least two AZs within a region protects against single-datacenter failure.
Key Services and Features
| Service | Multi-AZ Feature |
|---|---|
| ECS | Deploy instances across AZs; use Auto Scaling Group with multi-AZ |
| SLB / ALB | Distributes traffic across AZs; health checks remove unhealthy instances |
| RDS | Multi-AZ deployment with automatic failover (primary in AZ-a, standby in AZ-b) |
| PolarDB | Cluster with one primary node and up to 15 read-only nodes across AZs |
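| OSS | Zone-redundant storage (ZRS) keeps data across multiple AZs in a region; it must be selected for the bucket, since locally redundant storage keeps data in a single AZ (available for storage classes such as Standard and IA) |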
| Redis | Standard (dual-replica) or Cluster edition with multi-AZ deployment |
Recommended Configurations
- Minimum two AZs for any production workload.
- SLB listeners should include backend servers in both AZs.
- RDS: enable multi-AZ and automatic failover. Plan for a 30-60 second failover window.
- ECS Auto Scaling Group: set MultiAZPolicy to BALANCE or COST_OPTIMIZED.
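To make the intent of a balanced multi-AZ policy concrete, here is a minimal sketch of spreading a desired instance count evenly across zones. It illustrates the goal only and is not the Auto Scaling service's actual placement algorithm; the function names and zone labels are hypothetical.

```python
def balanced_placement(total: int, zones: list[str]) -> dict[str, int]:
    """Spread total instances across zones so counts differ by at most one."""
    base, extra = divmod(total, len(zones))
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


def capacity_after_zone_loss(placement: dict[str, int]) -> int:
    """Worst case: instances left if the most populated zone fails."""
    return sum(placement.values()) - max(placement.values())


placement = balanced_placement(5, ["az-a", "az-b"])
print(placement, capacity_after_zone_loss(placement))
```

The worst-case figure is useful when sizing: if the surviving zone must carry full load, the group needs enough headroom to absorb the loss of its largest zone.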
Multi-Region
Deploy across geographically separated regions (e.g., Shanghai + Singapore for APAC coverage) for disaster recovery, compliance, or latency optimization.
Key Services and Features
| Service | Multi-Region Feature |
|---|---|
| CEN | Interconnects VPCs across regions via Transit Router inter-region connections |
| Global Acceleration (GA) | Intelligent DNS + anycast acceleration for global user access |
| Alibaba Cloud DNS | Geo-based DNS routing (similar to Route 53) |
| GTM (Global Traffic Manager) | Health-check-based DNS failover between regions |
| DTS (Data Transmission Service) | Real-time data synchronization (RDS, PolarDB, MongoDB, etc.) between regions |
| OSS Cross-Region Replication | Asynchronous object replication between OSS buckets in different regions |
| PolarDB GDN | Global Database Network: storage-level replication across PolarDB clusters |
Recommended Configurations
- Use CEN for private inter-region network connectivity; purchase appropriate bandwidth packages.
- Use GTM or Alibaba Cloud DNS for DNS-based failover with health checks.
- Use DTS for database replication (configurable latency, conflict resolution policies).
- Use OSS CRR for object storage replication (async, eventual consistency).
- Deploy stateless application layers to simplify failover; store session state in ApsaraDB for Redis (which supports global replication).
Real-World Example
An e-commerce platform deploys in China (Shanghai) as primary and China (Beijing) as secondary. CEN connects the two regions with a 5 Gbit/s bandwidth package. DTS replicates RDS MySQL from Shanghai to Beijing with near-real-time latency. OSS CRR replicates product images. GTM monitors health endpoints in both regions; if Shanghai becomes unhealthy, GTM shifts DNS to Beijing within 30 seconds. Application tier is stateless (ECS + Auto Scaling), so Beijing can scale up from a minimal warm standby to full capacity in minutes.
5. Disaster Recovery (DR)
DR Tiers
| Tier | RPO | RTO | Pattern | Cost |
|---|---|---|---|---|
| Level 1: Data backup only | Hours-Days | Days | OSS backup, HBR (Hybrid Backup Recovery) | Lowest |
| Level 2: Cold standby | Hours | Hours | Infra defined in IaC; data replicated; no running compute | Low |
| Level 3: Warm standby | Minutes | Minutes | Minimal compute in DR region; scale up on failover | Medium |
| Level 4: Active-passive | Seconds | Seconds | Full stack in both regions; only primary serves traffic | High |
| Level 5: Active-active | ~0 | Seconds | Both regions serve traffic simultaneously; bidirectional replication | Highest |
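One way to use this table is to pick the cheapest tier whose worst-case RPO and RTO still meet the targets. The sketch below does that, but the numeric bounds are assumptions introduced here to stand in for the qualitative labels (days, hours, minutes, seconds); replace them with figures agreed for your own workloads.

```python
from typing import Optional

# Assumed upper bounds in seconds for the table's qualitative labels.
DAYS, HOURS, MINUTES, SECONDS, NEAR_ZERO = 7 * 86400, 24 * 3600, 3600, 60, 1

# (tier, worst-case RPO, worst-case RTO), ordered from cheapest to most expensive.
TIERS = [
    ("Level 1: Data backup only", DAYS, DAYS),
    ("Level 2: Cold standby", HOURS, HOURS),
    ("Level 3: Warm standby", MINUTES, MINUTES),
    ("Level 4: Active-passive", SECONDS, SECONDS),
    ("Level 5: Active-active", NEAR_ZERO, SECONDS),
]


def cheapest_tier(target_rpo_s: float, target_rto_s: float) -> Optional[str]:
    """Return the first (cheapest) tier meeting both targets, or None."""
    for name, rpo, rto in TIERS:
        if rpo <= target_rpo_s and rto <= target_rto_s:
            return name
    return None


print(cheapest_tier(300, 300))  # targets of five minutes each
```

Because the bounds are deliberately pessimistic, the function errs toward a more expensive tier than a tuned deployment might need.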
Active-Passive DR
Region A (Primary) Region B (Standby)
+--------------------+ +--------------------+
| SLB --> ECS (full) | DTS/OSS CRR | SLB --> ECS (full) |
| RDS Primary | -------------> | RDS Standby (read) |
| Redis Primary | -------------> | Redis Replica |
+--------------------+ +--------------------+
| |
+-- GTM health check -- DNS failover --+
- DTS replicates data from Region A to Region B in near-real-time.
- Region B runs the full stack but receives no user traffic (GTM routes all traffic to Region A).
- On failure, GTM detects the unhealthy Region A endpoint and updates DNS to point to Region B. The RDS standby in Region B is then promoted to primary.
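The failover decision in the last step can be sketched as a small state machine. This is a local illustration, not GTM's implementation: the class name, the threshold of three consecutive failed probes, and the absence of automatic failback are all assumptions made for the example.

```python
class FailoverController:
    """Tracks primary health and switches to standby after repeated failures."""

    def __init__(self, primary: str, standby: str, failure_threshold: int = 3):
        self.primary = primary
        self.standby = standby
        self.failure_threshold = failure_threshold  # assumed probe count
        self.active = primary
        self._failures = 0

    def observe(self, primary_healthy: bool) -> str:
        """Record one health probe and return the region that should serve traffic."""
        if self.active != self.primary:
            return self.active  # no automatic failback in this sketch
        self._failures = 0 if primary_healthy else self._failures + 1
        if self._failures >= self.failure_threshold:
            # DNS would move here; database promotion is a separate step.
            self.active = self.standby
        return self.active


controller = FailoverController("region-a", "region-b")
probes = [True, False, False, True, False, False, False, True]
print([controller.observe(p) for p in probes])
```

Requiring consecutive failures avoids flapping on a single lost probe, and keeping failback manual gives operators time to confirm that data in the original primary is consistent before traffic returns.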
Active-Active (Multi-Region)
Region A (Active) Region B (Active)
+--------------------+ +--------------------+
| SLB --> ECS | DRC / GDN | SLB --> ECS |
| PolarDB-X Primary | <-----------> | PolarDB-X Primary |
| Redis Cluster | <-----------> | Redis Cluster |
+--------------------+ +--------------------+
| |
+-- GTM geo-based routing (50/50) ----+
- PolarDB-X with DRC (Distributed Relational Coordinator): Supports bidirectional logical replication between regions, enabling writes in both. DRC handles conflict detection and resolution. RPO is approximately 0; RTO is in seconds.
- PolarDB GDN: For active-passive with near-zero RPO. Replicates at the storage layer (physical replication), latency typically under 2 seconds across regions. Only one cluster accepts writes.
- ApsaraDB for Redis Global: Supports bidirectional data synchronization between Redis instances in different regions.
- GTM geo-based routing: Splits traffic by geography (e.g., APAC users to Singapore, US users to Virginia). Per-region health checks enable automatic failover.
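Geo-based routing with per-region health checks amounts to "prefer the mapped region, fall back to any healthy one". The sketch below models that rule locally; the geography labels, region names, and default region are placeholders taken from or invented around the example above, and this is not the GTM API.

```python
from typing import Optional

# Geography -> preferred region, following the example in the text.
GEO_MAP = {"APAC": "singapore", "US": "virginia"}


def resolve(user_geo: str, healthy: set[str], default: str = "singapore") -> Optional[str]:
    """Return the region that should serve a user, or None if none is healthy."""
    preferred = GEO_MAP.get(user_geo, default)
    if preferred in healthy:
        return preferred
    # Fall back deterministically to the first healthy region by name.
    return min(healthy) if healthy else None


print(resolve("US", {"singapore", "virginia"}))
print(resolve("US", {"singapore"}))
```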
Hybrid DR (Cloud + On-Premises)
- HBR (Hybrid Backup Recovery): Backs up on-premises data to Alibaba Cloud OSS. Supports file, database, VM, and NAS backups.
- Express Connect / SAG (Smart Access Gateway): Provides private, dedicated connectivity between on-premises DC and Alibaba Cloud VPCs.
- SDI (SAG): Software-defined interconnect for branch offices and DCs.
- DTS: Can replicate between on-premises databases and cloud RDS.
Recommended Configurations
- Define RTO/RPO targets upfront; they determine the DR tier and cost.
- Use Infrastructure as Code (Terraform or ROS -- Resource Orchestration Service) to define the full stack. In a cold/warm standby, you can recreate the DR region from code.
- Test failover regularly (quarterly minimum). Use GTM's simulation mode to verify DNS failover without affecting production.
- Stateless app tier: Makes failover dramatically simpler. Externalize all state to managed services (Redis, RDS, OSS).
- Separate DR automation from production: Use different Terraform state / ROS stacks for the DR region to avoid a single point of failure.
6. DMZ Patterns
Architecture Summary
The DMZ (Demilitarized Zone) pattern on Alibaba Cloud uses subnet tiering within a VPC to create a layered defense. Public-facing resources sit in a "public" vSwitch (DMZ); application and data tiers sit in private vSwitches with no direct Internet exposure.
Internet
|
v
+-----------------------------------------------+
| VPC (10.0.0.0/16) |
| |
| +-- Public vSwitch (AZ-a) -- DMZ Tier ------+|
| | Cloud Firewall (Internet border) ||
| | WAF (for HTTP/HTTPS) ||
| | SLB / ALB (receives traffic) ||
| | NAT Gateway (SNAT for egress) ||
| | Bastion Host / VPN Gateway (admin access) ||
| +--------------------------------------------+|
| | |
| +-- Private vSwitch (AZ-a) -- App Tier -----+|
| | ECS Auto Scaling Group ||
| | ECI (Elastic Container Instance) ||
| | Security Groups (allow SLB -> app only) ||
| +--------------------------------------------+|
| | |
| +-- Private vSwitch (AZ-a) -- Data Tier ----+|
| | RDS / PolarDB (no public IP) ||
| | ApsaraDB for Redis (no public IP) ||
| | OSS (VPC endpoint, no public access) ||
| +--------------------------------------------+|
+-----------------------------------------------+
Key Services
| Service | Role in DMZ |
|---|---|
| Cloud Firewall | Three modes: Internet border (N/S on EIP/SLB), VPC border (E/W between VPCs), internal firewall (E/W between ECS). Includes IPS/IDS with threat intelligence. |
| NAT Gateway | Internet NAT Gateway provides SNAT (outbound) and DNAT (inbound port forwarding) for private resources. VPC NAT Gateway resolves overlapping CIDRs. Performance: auto-scales up to 100K CPS, 15 Gbit/s throughput, 2M concurrent connections. |
| SLB (Server Load Balancer) | Layer 4 (CLB -- Classic Load Balancer) and Layer 7 (ALB -- Application Load Balancer). Sits in the DMZ, distributes to private ECS. |
| WAF (Web Application Firewall) | Inspects HTTP/HTTPS traffic for OWASP Top 10, bot management, custom rules. Placed in front of SLB/ALB. |
| Bastion Host | Managed jump server for SSH/RDP access to private instances. Integrates with RAM for access control. |
| VPN Gateway | IPsec-VPN or SSL-VPN for site-to-site or client-to-site private access. Placed in DMZ vSwitch. |
| Security Center | Unified threat detection, vulnerability scanning, compliance checking. |
Recommended Configurations
- Network segmentation: Three tiers of vSwitches -- public (DMZ), private (app), private (data). Route tables enforce traffic flow: DMZ -> app, app -> data. No direct DMZ -> data routing.
- Cloud Firewall Internet border: Enable on all EIPs, public SLBs, and NAT Gateways. Configure allow-list rules; set IPS to block mode (not just alert). Cloud Firewall operates at layers 3/4 and 7 (HTTP).
- Cloud Firewall VPC border: If using CEN or VPC peering, enable VPC border firewall. Default-deny; add explicit allow rules between VPCs.
- Cloud Firewall internal firewall: For micro-segmentation between ECS within a VPC. Tag instances by role (web, app, db) and write policies based on tags.
- NAT Gateway placement: Deploy in the DMZ vSwitch. Configure SNAT entries for each private vSwitch that needs Internet egress. Optionally use DNAT for specific inbound port forwarding (avoid this if possible; prefer SLB).
- SLB placement: Deploy in the DMZ vSwitch with EIP or Internet-facing. Backend servers in private vSwitches. Configure health checks. Enable access logging to SLS.
- Bastion Host / VPN Gateway: Place in DMZ vSwitch. Never assign public IPs to app or DB instances. Use RAM roles and MFA for admin access.
- Defense-in-depth layers:
| Layer | Control |
|---|---|
| Edge | Cloud Firewall (Internet border) -- DDoS, IPS, access control |
| Perimeter | WAF -- OWASP, bot management, custom rules for HTTP/HTTPS |
| Network | Security Groups (stateful, per-instance) + Network ACLs (stateless, per-subnet) |
| Host | Security Center -- vulnerability scanning, baseline checks |
| Application | Application-level auth (RAM, OAuth, JWT) |
Access Control Policy Matrix (Example)
| Source | Destination | Protocol | Ports | Action |
|---|---|---|---|---|
| Internet | SLB (DMZ) | TCP | 443 | Allow |
| Internet | Any | Any | Any | Deny |
| SLB (DMZ) | ECS App (private) | TCP | 8080 | Allow |
| ECS App | RDS (data) | TCP | 3306 | Allow |
| Bastion Host | All ECS | TCP | 22 | Allow |
| Private subnet | NAT GW (DMZ) | TCP | 80, 443 | Allow (SNAT egress) |
| All others | All others | Any | Any | Deny |
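A matrix like this is typically evaluated top to bottom, with the first matching rule deciding the outcome. The sketch below encodes the rows as data and applies first-match evaluation with a default deny. The zone labels are illustrative, and the "All ECS" destination is collapsed to the single app tier for brevity; this is not a Cloud Firewall or security group API.

```python
# (source, destination, protocol, ports or None for any, action)
RULES = [
    ("internet", "slb", "tcp", {443}, "allow"),
    ("internet", "*", "*", None, "deny"),
    ("slb", "ecs-app", "tcp", {8080}, "allow"),
    ("ecs-app", "rds", "tcp", {3306}, "allow"),
    ("bastion", "ecs-app", "tcp", {22}, "allow"),
    ("private-subnet", "nat-gw", "tcp", {80, 443}, "allow"),
    ("*", "*", "*", None, "deny"),
]


def _matches(pattern: str, value: str) -> bool:
    return pattern == "*" or pattern == value


def evaluate(src: str, dst: str, protocol: str, port: int) -> str:
    """Return the action of the first rule that matches the flow."""
    for r_src, r_dst, r_proto, r_ports, action in RULES:
        if (
            _matches(r_src, src)
            and _matches(r_dst, dst)
            and _matches(r_proto, protocol)
            and (r_ports is None or port in r_ports)
        ):
            return action
    return "deny"  # default deny if no rule matches


print(evaluate("internet", "slb", "tcp", 443))
print(evaluate("slb", "rds", "tcp", 3306))
```

Keeping the rules as data makes it easy to unit-test intended and forbidden flows before translating them into actual firewall policies.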
Real-World Example
A fintech company's production VPC in Shanghai uses three tiers. The DMZ vSwitch hosts a WAF instance, an ALB, a NAT Gateway, and a VPN Gateway. The app vSwitch runs ECS instances in an Auto Scaling Group behind the ALB. The data vSwitch hosts PolarDB MySQL and ApsaraDB for Redis. Cloud Firewall Internet border is enabled on the ALB EIP and NAT Gateway with IPS in block mode. Internal firewall enforces web->app on port 8080, app->db on port 3306, and denies all other cross-tier traffic. All admin access goes through the VPN Gateway to the Bastion Host.
7. Landing Zone Best Practices
Architecture Summary
A Landing Zone is a pre-configured, governed multi-account environment that provides a secure baseline for all workloads. Alibaba Cloud's Landing Zone implementation uses Cloud Governance Center (built on Resource Directory) to automate account provisioning, apply guardrails, and maintain compliance.
Core Landing Zone Components
+-------------------------------------------------------------+
| Landing Zone |
| |
| Management Account (Root) |
| - Resource Directory (account hierarchy) |
| - Cloud Governance Center (guardrails, account factory) |
| - Consolidated billing |
| - RAM SSO / External IdP federation |
| |
| Core Accounts: |
| +-- Log Archive Account |
| | ActionTrail -> SLS (centralized, immutable) |
| +-- Security Account |
| | Cloud Firewall, Security Center, Config rules |
| +-- Network Account |
| | CEN, Transit Routers, shared VPCs, DNS |
| +-- Shared Services Account |
| Base images (ECS), container images (ACR), |
| shared services (KMS, Certificate Manager) |
| |
| Workload OUs: |
| +-- Business Unit A (dev/staging/prod accounts) |
| +-- Business Unit B (dev/staging/prod accounts) |
| +-- Sandbox OU (time-limited experiment accounts) |
+-------------------------------------------------------------+
Key Services
| Service | Role in Landing Zone |
|---|---|
| Resource Directory | Account hierarchy: root, folders, member accounts. Enables trusted access for services. |
| Cloud Governance Center | Landing Zone setup wizard, Account Factory, baseline guardrails, compliance dashboards. |
| RAM / SSO | Centralized identity: users, groups, roles. SSO to all member accounts. External IdP federation (SAML/OIDC). |
| ActionTrail | API-level audit logging for all accounts. Configure to deliver to the Log Archive account's OSS bucket or SLS project. |
| Cloud Config | Compliance rules (e.g., "no public ECS," "all resources tagged," "encryption required"). Evaluates resources continuously. |
| Control Policies | Organization-level restrictions on what actions member accounts can perform (analogous to AWS SCPs). Applied at OU or account level. |
| CEN (in Network Account) | Centralized network connectivity for all workload accounts. Transit Router with custom route tables. |
| Cloud Firewall (in Security Account) | Centralized firewall management across all accounts. Internet border, VPC border, and internal firewall. |
| ROS / Terraform | Infrastructure as Code for reproducible Landing Zone deployment. Alibaba Cloud provides reference ROS templates. |
| KMS (Key Management Service) | Centralized encryption key management. Workload accounts delegate to the security account's KMS. |
Recommended Configurations
- Start with Cloud Governance Center: Use the built-in Landing Zone setup wizard to create the core accounts (management, log, security, shared services) with recommended baselines. This is faster and less error-prone than manual setup.
- Account Factory: Configure Cloud Governance Center's Account Factory to auto-provision new accounts with:
  - RAM roles for admin and operator access (via SSO)
  - ActionTrail forwarding to Log Archive account
  - Cloud Config rules (baseline compliance)
  - Network attachment to the central CEN Transit Router
  - Resource tags (owner, environment, cost-center)
- Guardrails (Control Policies):
  - Deny public ECS instances in prod OUs
  - Require encryption for OSS and RDS
  - Deny creation of resources outside approved regions
  - Require owner and environment tags on all resources
- Centralized networking:
  - Network Account owns the CEN instance and Transit Routers
  - Shared VPCs (via RAM resource sharing) for common services
  - Workload VPCs attach to the TR from their own accounts
  - Cloud Firewall VPC border enabled on all TR attachments
- Centralized logging and monitoring:
  - All accounts forward ActionTrail to Log Archive
  - SLS (Simple Log Service) centralized in the Log Archive account
  - Cloud Config aggregation in the Security account
- Identity and access:
  - Federate with external IdP (Okta, Azure AD) via SAML
  - Map IdP groups to RAM roles in each account
  - Enforce MFA for all human users
  - Use RAM roles (not AK/SK) for cross-account service access
- IaC for the Landing Zone itself:
  - Define the Landing Zone in Terraform or ROS
  - Version-control the Landing Zone configuration
  - Use CI/CD to apply changes (with plan-and-apply pipeline)
  - Alibaba Cloud publishes reference ROS templates for Landing Zone setup
- Regular drift detection:
  - Cloud Governance Center provides compliance dashboards
  - Set up alerts for non-compliant resources
  - Periodic review of Control Policies
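Because guardrails are applied at OU or account level, the set that governs a given account is the combination of everything attached along its path from the root. The sketch below models that lookup locally. The folder names, account names, and policy names are hypothetical, and it assumes policies attached to a folder apply to everything beneath it; it does not call the Resource Directory API.

```python
# Child -> parent relationships in a hypothetical directory tree.
PARENT = {
    "Core": "Root",
    "Workloads": "Root",
    "BU-A": "Workloads",
    "A-Dev": "BU-A",
    "A-Prod": "BU-A",
}

# Policies attached directly at each node (illustrative names).
ATTACHED = {
    "Root": ["deny-unapproved-regions"],
    "Workloads": ["require-owner-tag"],
    "A-Prod": ["deny-public-ecs"],
}


def effective_policies(node: str) -> list[str]:
    """Collect policies from the root down to the given folder or account."""
    chain = []
    current = node
    while current is not None:
        chain.append(current)
        current = PARENT.get(current)  # None once the root is passed
    policies = []
    for n in reversed(chain):
        policies.extend(ATTACHED.get(n, []))
    return policies


print(effective_policies("A-Prod"))
print(effective_policies("A-Dev"))
```

A report like this, generated per account, is a convenient input to the periodic Control Policy review suggested above.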
Real-World Example
A large retail enterprise builds its Landing Zone using Cloud Governance Center. The management account holds Resource Directory with three top-level OUs: Core, BusinessUnits, and Sandbox. The Core OU contains the log-archive, security, and network accounts. The BusinessUnits OU has OUs per department (Online, Stores, SupplyChain), each with dev/staging/prod accounts. Account Factory provisions new accounts with baseline RAM roles, ActionTrail forwarding, Cloud Config rules, and CEN TR attachment in under 15 minutes. Control Policies enforce: no public ECS in prod, encryption required for all storage, resources only in approved regions (Shanghai, Beijing, Singapore). The security account runs Cloud Firewall in "managed" mode across all accounts, with centralized policy management.
8. Service Naming Quick Reference
Alibaba Cloud services often have different marketing names and console names. This table maps the common architecture concepts to the exact Alibaba Cloud service names.
| Concept | Alibaba Cloud Service | Console Abbreviation |
|---|---|---|
| Virtual Network | Virtual Private Cloud | VPC |
| Subnet | vSwitch | vSW |
| Load Balancer (L4/L7) | Server Load Balancer | SLB / CLB |
| Load Balancer (L7, next-gen) | Application Load Balancer | ALB |
| NAT | NAT Gateway | NAT GW |
| Firewall (cloud-native) | Cloud Firewall | CFW |
| WAF | Web Application Firewall | WAF |
| Dedicated line | Express Connect | EC |
| SD-WAN | Smart Access Gateway | SAG |
| WAN / Transit | Cloud Enterprise Network | CEN |
| Transit Hub | Transit Router | TR |
| Edge Router (for Express Connect) | Virtual Border Router | VBR |
| Global routing for Express Connect | Express Connect Router | ECR |
| Multi-account management | Resource Directory | RD |
| Governance / Landing Zone | Cloud Governance Center | CGC |
| IAM | Resource Access Management | RAM |
| IaC (native) | Resource Orchestration Service | ROS |
| Object Storage | Object Storage Service | OSS |
| Relational DB (managed MySQL/PG/SQL Server) | ApsaraDB for RDS | RDS |
| Cloud-native DB (MySQL/PG compatible) | PolarDB | PolarDB |
| Distributed DB | PolarDB-X | PolarDB-X |
| In-memory cache | ApsaraDB for Redis | Redis |
| Data replication | Data Transmission Service | DTS |
| DNS (managed) | Alibaba Cloud DNS | DNS |
| Global traffic failover | Global Traffic Manager | GTM |
| Backup | Hybrid Backup Recovery | HBR |
| Logging | Simple Log Service | SLS |
| Audit trail | ActionTrail | ActionTrail |
| Compliance | Cloud Config | Config |
| Key management | Key Management Service | KMS |
| Security operations | Security Center | Security Center |
| Container registry | Container Registry | ACR |