How to fix "An Unknown Error occurred" when creating multiple Google Cloud SQL instances with private IP simultaneously?

Problem description

Our cloud backend setup contains 5 Cloud SQL for Postgres instances. We manage our infrastructure using Terraform. We connect to them from GKE using a public IP and the Cloud SQL Proxy container.

In order to simplify our setup we wish to get rid of the proxy containers by moving to a private IP. I tried following the Terraform guide. While creating a single instance works fine, trying to create 5 instances simultaneously ends with 4 failed instances and 1 successful one.

The error which appears in the Google Cloud Console on the failed instances is "An Unknown Error occurred".

The following code reproduces it. Pay attention to the count = 5 line:

resource "google_compute_network" "private_network" {
  provider = "google-beta"

  name = "private-network"
}

resource "google_compute_global_address" "private_ip_address" {
  provider = "google-beta"

  name = "private-ip-address"
  purpose = "VPC_PEERING"
  address_type = "INTERNAL"
  prefix_length = 16
  network = "${google_compute_network.private_network.self_link}"
}

resource "google_service_networking_connection" "private_vpc_connection" {
  provider = "google-beta"

  network = "${google_compute_network.private_network.self_link}"
  service = "servicenetworking.googleapis.com"
  reserved_peering_ranges = ["${google_compute_global_address.private_ip_address.name}"]
}

resource "google_sql_database_instance" "instance" {
  provider = "google-beta"
  count = 5

  name = "private-instance-${count.index}"
  database_version = "POSTGRES_9_6"

  depends_on = [
    "google_service_networking_connection.private_vpc_connection"
  ]

  settings {
    tier = "db-custom-1-3840"
    availability_type = "REGIONAL"
    ip_configuration {
      ipv4_enabled = "false"
      private_network = "${google_compute_network.private_network.self_link}"
    }
  }
}

provider "google-beta" {
  version = "~> 2.5"
  credentials = "credentials.json"
  project = "PROJECT_ID"
  region = "us-central1"
  zone = "us-central1-a"
}

I tried several alternatives:

  • Waiting a minute after creating the google_service_networking_connection and then creating all the instances simultaneously, but I got the same error.
  • Creating an address range and a google_service_networking_connection per instance, but I got an error saying that multiple google_service_networking_connection resources cannot be created simultaneously.
  • Creating an address range per instance and a single google_service_networking_connection which links to all of them, but I got the same error.

Solution

I found an ugly yet working solution. There is a bug in GCP: it does not prevent simultaneous creation of instances, even though that creation cannot complete successfully. There is neither documentation about it nor a meaningful error message. It appears in the Terraform Google provider issue tracker as well.

One alternative is adding a dependency between the instances. This allows their creation to complete successfully. However, each instance takes several minutes to create, so creating them one after another adds up to a long total time. If we instead add an artificial delay of 60 seconds between instance creations, we manage to avoid the failures. Notes:

  • The number of seconds of delay needed depends on the instance tier. For example, for db-f1-micro, 30 seconds were enough. They were not enough for db-custom-1-3840.
  • I am not sure what the exact number of seconds needed for db-custom-1-3840 is. 30 seconds were not enough, 60 were.
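The dependency-based alternative mentioned above can be sketched roughly as follows. This is only an illustration, not code from the original answer: the resource names chained_1 and chained_2 and the instance names are hypothetical, and the fragment assumes the network, address range and peering connection resources from the reproduction code are defined in the same configuration. Each instance lists the previous one in depends_on, so Terraform creates them strictly one after another.

```hcl
# Sketch only: serialize instance creation by chaining depends_on.
# Assumes google_compute_network.private_network and
# google_service_networking_connection.private_vpc_connection
# exist as in the reproduction code.

resource "google_sql_database_instance" "chained_1" {
  provider = "google-beta"

  name = "private-instance-chained-1" # hypothetical name
  database_version = "POSTGRES_9_6"

  depends_on = [
    "google_service_networking_connection.private_vpc_connection"
  ]

  settings {
    tier = "db-custom-1-3840"
    availability_type = "REGIONAL"
    ip_configuration {
      ipv4_enabled = "false"
      private_network = "${google_compute_network.private_network.self_link}"
    }
  }
}

resource "google_sql_database_instance" "chained_2" {
  provider = "google-beta"

  name = "private-instance-chained-2" # hypothetical name
  database_version = "POSTGRES_9_6"

  # Creation starts only after chained_1 has been fully created.
  depends_on = [
    "google_service_networking_connection.private_vpc_connection",
    "google_sql_database_instance.chained_1"
  ]

  settings {
    tier = "db-custom-1-3840"
    availability_type = "REGIONAL"
    ip_configuration {
      ipv4_enabled = "false"
      private_network = "${google_compute_network.private_network.self_link}"
    }
  }
}
```

The trade-off is the one described above: with this chain the total creation time is roughly the sum of the individual creation times, which is why the staggered delay is preferred.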

The following code sample resolves the issue. It shows only 2 instances: due to depends_on limitations I could not use the count feature, and the full code for 5 instances would be very long. It works the same way for 5 instances:

resource "google_compute_network" "private_network" {
  provider = "google-beta"

  name = "private-network"
}

resource "google_compute_global_address" "private_ip_address" {
  provider = "google-beta"

  name = "private-ip-address"
  purpose = "VPC_PEERING"
  address_type = "INTERNAL"
  prefix_length = 16
  network = "${google_compute_network.private_network.self_link}"
}

resource "google_service_networking_connection" "private_vpc_connection" {
  provider = "google-beta"

  network = "${google_compute_network.private_network.self_link}"
  service = "servicenetworking.googleapis.com"
  reserved_peering_ranges = ["${google_compute_global_address.private_ip_address.name}"]
}

locals {
  db_instance_creation_delay_factor_seconds = 60
}

resource "null_resource" "delayer_1" {
  depends_on = ["google_service_networking_connection.private_vpc_connection"]

  provisioner "local-exec" {
    command = "echo Gradual DB instance creation && sleep ${local.db_instance_creation_delay_factor_seconds * 0}"
  }
}

resource "google_sql_database_instance" "instance_1" {
  provider = "google-beta"

  name = "private-instance-delayed-1"
  database_version = "POSTGRES_9_6"

  depends_on = [
    "google_service_networking_connection.private_vpc_connection",
    "null_resource.delayer_1"
  ]

  settings {
    tier = "db-custom-1-3840"
    availability_type = "REGIONAL"
    ip_configuration {
      ipv4_enabled = "false"
      private_network = "${google_compute_network.private_network.self_link}"
    }
  }
}

resource "null_resource" "delayer_2" {
  depends_on = ["google_service_networking_connection.private_vpc_connection"]

  provisioner "local-exec" {
    command = "echo Gradual DB instance creation && sleep ${local.db_instance_creation_delay_factor_seconds * 1}"
  }
}

resource "google_sql_database_instance" "instance_2" {
  provider = "google-beta"

  name = "private-instance-delayed-2"
  database_version = "POSTGRES_9_6"

  depends_on = [
    "google_service_networking_connection.private_vpc_connection",
    "null_resource.delayer_2"
  ]

  settings {
    tier = "db-custom-1-3840"
    availability_type = "REGIONAL"
    ip_configuration {
      ipv4_enabled = "false"
      private_network = "${google_compute_network.private_network.self_link}"
    }
  }
}

provider "google-beta" {
  version = "~> 2.5"
  credentials = "credentials.json"
  project = "PROJECT_ID"
  region = "us-central1"
  zone = "us-central1-a"
}

provider "null" {
  version = "~> 1.0"
}
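Extending the sample to a third instance would presumably follow the same pattern. The fragment below is an extrapolation from the two instances shown, not code from the original answer, and the names delayer_3 and instance_3 are hypothetical. All delayers start as soon as the peering connection is ready, so the multiplier grows by one per instance: the third instance waits 60 * 2 = 120 seconds.

```hcl
# Sketch only: third instance, following the delayer pattern above.
# Assumes the locals block, network resources and providers
# from the sample are defined in the same configuration.

resource "null_resource" "delayer_3" {
  depends_on = ["google_service_networking_connection.private_vpc_connection"]

  provisioner "local-exec" {
    # Multiplier is (instance number - 1), so this sleeps 120 seconds.
    command = "echo Gradual DB instance creation && sleep ${local.db_instance_creation_delay_factor_seconds * 2}"
  }
}

resource "google_sql_database_instance" "instance_3" {
  provider = "google-beta"

  name = "private-instance-delayed-3" # hypothetical name
  database_version = "POSTGRES_9_6"

  depends_on = [
    "google_service_networking_connection.private_vpc_connection",
    "null_resource.delayer_3"
  ]

  settings {
    tier = "db-custom-1-3840"
    availability_type = "REGIONAL"
    ip_configuration {
      ipv4_enabled = "false"
      private_network = "${google_compute_network.private_network.self_link}"
    }
  }
}
```

Because the delay is defined once in locals, changing db_instance_creation_delay_factor_seconds adjusts the spacing for every instance at the same time.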


09-05 16:32