I am currently using Rook v1.2.2 to create a Ceph cluster on a Kubernetes cluster (v1.16.3), but I cannot get the rack level added to the CRUSH map.
I want to go from this:
ID CLASS WEIGHT  TYPE NAME
-1       0.02737 root default
-3       0.01369     host test-w1
 0   hdd 0.01369         osd.0
-5       0.01369     host test-w2
 1   hdd 0.01369         osd.1
to something like this:
ID CLASS WEIGHT  TYPE NAME                  STATUS REWEIGHT PRI-AFF
-1       0.01358 root default
-5       0.01358     zone zone1
-4       0.01358         rack rack1
-3       0.01358             host mynode
 0   hdd 0.00679                 osd.0          up  1.00000 1.00000
 1   hdd 0.00679                 osd.1          up  1.00000 1.00000
as described in the official Rook documentation (https://rook.io/docs/rook/v1.2/ceph-cluster-crd.html#osd-topology).
The steps I followed:
I have a v1.16.3 Kubernetes cluster with one master (test-m1) and two workers (test-w1 and test-w2).
I installed this cluster with Kubespray using its default configuration (https://kubespray.io/#/docs/getting-started).
I labeled the nodes with:
kubectl label node test-w1 topology.rook.io/rack=rack1
kubectl label node test-w2 topology.rook.io/rack=rack2
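For reference, the rack labels can be verified afterwards with a standard kubectl query, for example:
kubectl get nodes -L topology.rook.io/rack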
I also added the label role=storage-node and the taint storage-node=true:NoSchedule to force Rook to run only on the dedicated storage nodes (the label and taint commands are sketched right after this node description). Here is the complete set of labels and taints of one of the storage nodes:
Name:               test-w1
Roles:              <none>
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=test-w1
                    kubernetes.io/os=linux
                    role=storage-node
                    topology.rook.io/rack=rack1
Annotations:        csi.volume.kubernetes.io/nodeid: {"rook-ceph.cephfs.csi.ceph.com":"test-w1","rook-ceph.rbd.csi.ceph.com":"test-w1"}
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 29 Jan 2020 03:38:52 +0100
Taints:             storage-node=true:NoSchedule
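For completeness, a role label and taint like the ones above can be applied with the usual kubectl commands, along these lines:
kubectl label node test-w1 role=storage-node
kubectl label node test-w2 role=storage-node
kubectl taint node test-w1 storage-node=true:NoSchedule
kubectl taint node test-w2 storage-node=true:NoSchedule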
I started the Rook deployment with common.yaml: https://github.com/rook/rook/blob/master/cluster/examples/kubernetes/ceph/common.yaml
Then I applied a customized operator.yaml so that the operator, the CSI plugins, and the agents run on the nodes labeled "role=storage-node":
#################################################################################################################
# The deployment for the rook operator
# Contains the common settings for most Kubernetes deployments.
# For example, to create the rook-ceph cluster:
# kubectl create -f common.yaml
# kubectl create -f operator.yaml
# kubectl create -f cluster.yaml
#
# Also see other operator sample files for variations of operator.yaml:
# - operator-openshift.yaml: Common settings for running in OpenShift
#################################################################################################################
# OLM: BEGIN OPERATOR DEPLOYMENT
apiVersion: apps/v1
kind: Deployment
metadata:
  name: rook-ceph-operator
  namespace: rook-ceph
  labels:
    operator: rook
    storage-backend: ceph
spec:
  selector:
    matchLabels:
      app: rook-ceph-operator
  replicas: 1
  template:
    metadata:
      labels:
        app: rook-ceph-operator
    spec:
      serviceAccountName: rook-ceph-system
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: role
                operator: In
                values:
                - storage-node
      tolerations:
      - key: "storage-node"
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: rook-ceph-operator
        image: rook/ceph:v1.2.2
        args: ["ceph", "operator"]
        volumeMounts:
        - mountPath: /var/lib/rook
          name: rook-config
        - mountPath: /etc/ceph
          name: default-config-dir
        env:
        # If the operator should only watch for cluster CRDs in the same namespace, set this to "true".
        # If this is not set to true, the operator will watch for cluster CRDs in all namespaces.
        - name: ROOK_CURRENT_NAMESPACE_ONLY
          value: "false"
        # To disable RBAC, uncomment the following:
        # - name: RBAC_ENABLED
        #   value: "false"
        # Rook Agent toleration. Will tolerate all taints with all keys.
        # Choose between NoSchedule, PreferNoSchedule and NoExecute:
        # - name: AGENT_TOLERATION
        #   value: "NoSchedule"
        # (Optional) Rook Agent toleration key. Set this to the key of the taint you want to tolerate
        # - name: AGENT_TOLERATION_KEY
        #   value: "storage-node"
        # (Optional) Rook Agent tolerations list. Put here list of taints you want to tolerate in YAML format.
        - name: AGENT_TOLERATIONS
          value: |
            - effect: NoSchedule
              key: storage-class
              operator: Exists
        # (Optional) Rook Agent priority class name to set on the pod(s)
        # - name: AGENT_PRIORITY_CLASS_NAME
        #   value: "<PriorityClassName>"
        # (Optional) Rook Agent NodeAffinity.
        - name: AGENT_NODE_AFFINITY
          value: "role=storage-node"
        # (Optional) Rook Agent mount security mode. Can by `Any` or `Restricted`.
        # `Any` uses Ceph admin credentials by default/fallback.
        # For using `Restricted` you must have a Ceph secret in each namespace storage should be consumed from and
        # set `mountUser` to the Ceph user, `mountSecret` to the Kubernetes secret name.
        # to the namespace in which the `mountSecret` Kubernetes secret namespace.
        # - name: AGENT_MOUNT_SECURITY_MODE
        #   value: "Any"
        # Set the path where the Rook agent can find the flex volumes
        # - name: FLEXVOLUME_DIR_PATH
        #   value: "<PathToFlexVolumes>"
        # Set the path where kernel modules can be found
        # - name: LIB_MODULES_DIR_PATH
        #   value: "<PathToLibModules>"
        # Mount any extra directories into the agent container
        # - name: AGENT_MOUNTS
        #   value: "somemount=/host/path:/container/path,someothermount=/host/path2:/container/path2"
        # Rook Discover toleration. Will tolerate all taints with all keys.
        # Choose between NoSchedule, PreferNoSchedule and NoExecute:
        # - name: DISCOVER_TOLERATION
        #   value: "NoSchedule"
        # (Optional) Rook Discover toleration key. Set this to the key of the taint you want to tolerate
        # - name: DISCOVER_TOLERATION_KEY
        #   value: "storage-node"
        # (Optional) Rook Discover tolerations list. Put here list of taints you want to tolerate in YAML format.
        - name: DISCOVER_TOLERATIONS
          value: |
            - effect: NoSchedule
              key: storage-node
              operator: Exists
        # (Optional) Rook Discover priority class name to set on the pod(s)
        # - name: DISCOVER_PRIORITY_CLASS_NAME
        #   value: "<PriorityClassName>"
        # (Optional) Discover Agent NodeAffinity.
        - name: DISCOVER_AGENT_NODE_AFFINITY
          value: "role=storage-node"
        # Allow rook to create multiple file systems. Note: This is considered
        # an experimental feature in Ceph as described at
        # http://docs.ceph.com/docs/master/cephfs/experimental-features/#multiple-filesystems-within-a-ceph-cluster
        # which might cause mons to crash as seen in https://github.com/rook/rook/issues/1027
        - name: ROOK_ALLOW_MULTIPLE_FILESYSTEMS
          value: "false"
        # The logging level for the operator: INFO | DEBUG
        - name: ROOK_LOG_LEVEL
          value: "INFO"
        # The interval to check the health of the ceph cluster and update the status in the custom resource.
        - name: ROOK_CEPH_STATUS_CHECK_INTERVAL
          value: "60s"
        # The interval to check if every mon is in the quorum.
        - name: ROOK_MON_HEALTHCHECK_INTERVAL
          value: "45s"
        # The duration to wait before trying to failover or remove/replace the
        # current mon with a new mon (useful for compensating flapping network).
        - name: ROOK_MON_OUT_TIMEOUT
          value: "600s"
        # The duration between discovering devices in the rook-discover daemonset.
        - name: ROOK_DISCOVER_DEVICES_INTERVAL
          value: "60m"
        # Whether to start pods as privileged that mount a host path, which includes the Ceph mon and osd pods.
        # This is necessary to workaround the anyuid issues when running on OpenShift.
        # For more details see https://github.com/rook/rook/issues/1314#issuecomment-355799641
        - name: ROOK_HOSTPATH_REQUIRES_PRIVILEGED
          value: "false"
        # In some situations SELinux relabelling breaks (times out) on large filesystems, and doesn't work with cephfs ReadWriteMany volumes (last relabel wins).
        # Disable it here if you have similar issues.
        # For more details see https://github.com/rook/rook/issues/2417
        - name: ROOK_ENABLE_SELINUX_RELABELING
          value: "true"
        # In large volumes it will take some time to chown all the files. Disable it here if you have performance issues.
        # For more details see https://github.com/rook/rook/issues/2254
        - name: ROOK_ENABLE_FSGROUP
          value: "true"
        # Disable automatic orchestration when new devices are discovered
        - name: ROOK_DISABLE_DEVICE_HOTPLUG
          value: "false"
        # Provide customised regex as the values using comma. For eg. regex for rbd based volume, value will be like "(?i)rbd[0-9]+".
        # In case of more than one regex, use comma to seperate between them.
        # Default regex will be "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+"
        # Add regex expression after putting a comma to blacklist a disk
        # If value is empty, the default regex will be used.
        - name: DISCOVER_DAEMON_UDEV_BLACKLIST
          value: "(?i)dm-[0-9]+,(?i)rbd[0-9]+,(?i)nbd[0-9]+"
        # Whether to enable the flex driver. By default it is enabled and is fully supported, but will be deprecated in some future release
        # in favor of the CSI driver.
        - name: ROOK_ENABLE_FLEX_DRIVER
          value: "false"
        # Whether to start the discovery daemon to watch for raw storage devices on nodes in the cluster.
        # This daemon does not need to run if you are only going to create your OSDs based on StorageClassDeviceSets with PVCs.
        - name: ROOK_ENABLE_DISCOVERY_DAEMON
          value: "true"
        # Enable the default version of the CSI CephFS driver. To start another version of the CSI driver, see image properties below.
        - name: ROOK_CSI_ENABLE_CEPHFS
          value: "true"
        # Enable the default version of the CSI RBD driver. To start another version of the CSI driver, see image properties below.
        - name: ROOK_CSI_ENABLE_RBD
          value: "true"
        - name: ROOK_CSI_ENABLE_GRPC_METRICS
          value: "true"
        # Enable deployment of snapshotter container in ceph-csi provisioner.
        - name: CSI_ENABLE_SNAPSHOTTER
          value: "true"
        # Enable Ceph Kernel clients on kernel < 4.17 which support quotas for Cephfs
        # If you disable the kernel client, your application may be disrupted during upgrade.
        # See the upgrade guide: https://rook.io/docs/rook/v1.2/ceph-upgrade.html
        - name: CSI_FORCE_CEPHFS_KERNEL_CLIENT
          value: "true"
        # CSI CephFS plugin daemonset update strategy, supported values are OnDelete and RollingUpdate.
        # Default value is RollingUpdate.
        #- name: CSI_CEPHFS_PLUGIN_UPDATE_STRATEGY
        #  value: "OnDelete"
        # CSI Rbd plugin daemonset update strategy, supported values are OnDelete and RollingUpdate.
        # Default value is RollingUpdate.
        #- name: CSI_RBD_PLUGIN_UPDATE_STRATEGY
        #  value: "OnDelete"
        # The default version of CSI supported by Rook will be started. To change the version
        # of the CSI driver to something other than what is officially supported, change
        # these images to the desired release of the CSI driver.
        #- name: ROOK_CSI_CEPH_IMAGE
        #  value: "quay.io/cephcsi/cephcsi:v1.2.2"
        #- name: ROOK_CSI_REGISTRAR_IMAGE
        #  value: "quay.io/k8scsi/csi-node-driver-registrar:v1.1.0"
        #- name: ROOK_CSI_PROVISIONER_IMAGE
        #  value: "quay.io/k8scsi/csi-provisioner:v1.4.0"
        #- name: ROOK_CSI_SNAPSHOTTER_IMAGE
        #  value: "quay.io/k8scsi/csi-snapshotter:v1.2.2"
        #- name: ROOK_CSI_ATTACHER_IMAGE
        #  value: "quay.io/k8scsi/csi-attacher:v1.2.0"
        # kubelet directory path, if kubelet configured to use other than /var/lib/kubelet path.
        #- name: ROOK_CSI_KUBELET_DIR_PATH
        #  value: "/var/lib/kubelet"
        # (Optional) Ceph Provisioner NodeAffinity.
        - name: CSI_PROVISIONER_NODE_AFFINITY
          value: "role=storage-node"
        # (Optional) CEPH CSI provisioner tolerations list. Put here list of taints you want to tolerate in YAML format.
        # CSI provisioner would be best to start on the same nodes as other ceph daemons.
        - name: CSI_PROVISIONER_TOLERATIONS
          value: |
            - effect: NoSchedule
              key: storage-node
              operator: Exists
        # (Optional) Ceph CSI plugin NodeAffinity.
        - name: CSI_PLUGIN_NODE_AFFINITY
          value: "role=storage-node"
        # (Optional) CEPH CSI plugin tolerations list. Put here list of taints you want to tolerate in YAML format.
        # CSI plugins need to be started on all the nodes where the clients need to mount the storage.
        - name: CSI_PLUGIN_TOLERATIONS
          value: |
            - effect: NoSchedule
              key: storage-node
              operator: Exists
        # Configure CSI cephfs grpc and liveness metrics port
        #- name: CSI_CEPHFS_GRPC_METRICS_PORT
        #  value: "9091"
        #- name: CSI_CEPHFS_LIVENESS_METRICS_PORT
        #  value: "9081"
        # Configure CSI rbd grpc and liveness metrics port
        #- name: CSI_RBD_GRPC_METRICS_PORT
        #  value: "9090"
        #- name: CSI_RBD_LIVENESS_METRICS_PORT
        #  value: "9080"
        # Time to wait until the node controller will move Rook pods to other
        # nodes after detecting an unreachable node.
        # Pods affected by this setting are:
        # mgr, rbd, mds, rgw, nfs, PVC based mons and osds, and ceph toolbox
        # The value used in this variable replaces the default value of 300 secs
        # added automatically by k8s as Toleration for
        # <node.kubernetes.io/unreachable>
        # The total amount of time to reschedule Rook pods in healthy nodes
        # before detecting a <not ready node> condition will be the sum of:
        # --> node-monitor-grace-period: 40 seconds (k8s kube-controller-manager flag)
        # --> ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS: 5 seconds
        - name: ROOK_UNREACHABLE_NODE_TOLERATION_SECONDS
          value: "5"
        # The name of the node to pass with the downward API
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        # The pod name to pass with the downward API
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        # The pod namespace to pass with the downward API
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
      # Uncomment it to run rook operator on the host network
      #hostNetwork: true
      volumes:
      - name: rook-config
        emptyDir: {}
      - name: default-config-dir
        emptyDir: {}
# OLM: END OPERATOR DEPLOYMENT
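(As a quick sanity check, after applying common.yaml and this operator.yaml, the operator, rook-discover, and CSI pods should all land on the labeled workers; something like the following shows where they were scheduled:)
kubectl create -f common.yaml
kubectl create -f operator.yaml
kubectl -n rook-ceph get pods -o wide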
Then I applied my own customized ceph-cluster.yaml to allow the pods to run on the nodes labeled "role=storage-node":
#################################################################################################################
# Define the settings for the rook-ceph cluster with settings that should only be used in a test environment.
# A single filestore OSD will be created in the dataDirHostPath.
# For example, to create the cluster:
# kubectl create -f common.yaml
# kubectl create -f operator.yaml
# kubectl create -f ceph-cluster.yaml
#################################################################################################################
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: ceph/ceph:v14.2.5
    allowUnsupported: true
  dataDirHostPath: /var/lib/rook
  skipUpgradeChecks: false
  mon:
    count: 1
    allowMultiplePerNode: true
  dashboard:
    enabled: true
    ssl: true
  monitoring:
    enabled: false # requires Prometheus to be pre-installed
    rulesNamespace: rook-ceph
  network:
    hostNetwork: false
  rbdMirroring:
    workers: 0
  placement:
    all:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
          - matchExpressions:
            - key: role
              operator: In
              values:
              - storage-node
      tolerations:
      - key: "storage-node"
        operator: "Exists"
        effect: "NoSchedule"
  mgr:
    modules:
    # the pg_autoscaler is only available on nautilus or newer. remove this if testing mimic.
    - name: pg_autoscaler
      enabled: true
  storage:
    useAllNodes: false
    useAllDevices: false
    nodes:
    - name: "test-w1"
      directories:
      - path: /var/lib/rook
    - name: "test-w2"
      directories:
      - path: /var/lib/rook
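(Once the CephCluster CR is applied, a rook-ceph-osd-prepare job runs on each storage node, and its logs are where the CRUSH location should show up. One possible way to check, assuming the usual app=rook-ceph-osd-prepare pod label; the pod name shown further below works just as well:)
kubectl create -f ceph-cluster.yaml
kubectl -n rook-ceph get pods -o wide
kubectl -n rook-ceph logs -l app=rook-ceph-osd-prepare --tail=30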
With this configuration, Rook does not apply the rack labels to the CRUSH map.
If I install the toolbox (https://rook.io/docs/rook/v1.2/ceph-toolbox.html), exec into it, and run:
ceph osd tree
ceph osd crush tree
I get the following output:
ID CLASS WEIGHT  TYPE NAME
-1       0.02737 root default
-3       0.01369     host test-w1
 0   hdd 0.01369         osd.0
-5       0.01369     host test-w2
 1   hdd 0.01369         osd.1
As you can see, no racks are defined, even though I labeled the nodes correctly.
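(For reference, I get into the toolbox pod with roughly the command from the toolbox documentation linked above:)
kubectl create -f toolbox.yaml
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash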
Surprisingly, the osd-prepare pod does retrieve the information, as shown in the first lines of the following logs:
$ kubectl logs rook-ceph-osd-prepare-test-w1-7cp4f -n rook-ceph
2020-01-29 09:59:07.272649 I | cephcmd: crush location of osd: root=default host=test-w1 rack=rack1
[couppayy@test-m1 test_local]$ cat preposd.txt
2020-01-29 09:59:07.155656 I | cephcmd: desired devices to configure osds: [{Name: OSDsPerDevice:1 MetadataDevice: DatabaseSizeMB:0 DeviceClass: IsFilter:false IsDevicePathFilter:false}]
2020-01-29 09:59:07.185024 I | rookcmd: starting Rook v1.2.2 with arguments '/rook/rook ceph osd provision'
2020-01-29 09:59:07.185069 I | rookcmd: flag values: --cluster-id=c9ee638a-1d02-4ad9-95c9-cb796f61623a, --data-device-filter=, --data-device-path-filter=, --data-devices=, --data-directories=/var/lib/rook, --encrypted-device=false, --force-format=false, --help=false, --location=, --log-flush-frequency=5s, --log-level=INFO, --metadata-device=, --node-name=test-w1, --operator-image=, --osd-database-size=0, --osd-journal-size=5120, --osd-store=, --osd-wal-size=576, --osds-per-device=1, --pvc-backed-osd=false, --service-account=
2020-01-29 09:59:07.185108 I | op-mon: parsing mon endpoints: a=10.233.35.212:6789
2020-01-29 09:59:07.272603 I | op-osd: CRUSH location=root=default host=test-w1 rack=rack1
2020-01-29 09:59:07.272649 I | cephcmd: crush location of osd: root=default host=test-w1 rack=rack1
2020-01-29 09:59:07.313099 I | cephconfig: writing config file /var/lib/rook/rook-ceph/rook-ceph.config
2020-01-29 09:59:07.313397 I | cephconfig: generated admin config in /var/lib/rook/rook-ceph
2020-01-29 09:59:07.322175 I | cephosd: discovering hardware
2020-01-29 09:59:07.322228 I | exec: Running command: lsblk --all --noheadings --list --output KNAME
2020-01-29 09:59:07.365036 I | exec: Running command: lsblk /dev/sda --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.416812 W | inventory: skipping device sda: Failed to complete 'lsblk /dev/sda': exit status 1. lsblk: /dev/sda: not a block device
2020-01-29 09:59:07.416873 I | exec: Running command: lsblk /dev/sda1 --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.450851 W | inventory: skipping device sda1: Failed to complete 'lsblk /dev/sda1': exit status 1. lsblk: /dev/sda1: not a block device
2020-01-29 09:59:07.450892 I | exec: Running command: lsblk /dev/sda2 --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.457890 W | inventory: skipping device sda2: Failed to complete 'lsblk /dev/sda2': exit status 1. lsblk: /dev/sda2: not a block device
2020-01-29 09:59:07.457934 I | exec: Running command: lsblk /dev/sr0 --bytes --nodeps --pairs --output SIZE,ROTA,RO,TYPE,PKNAME
2020-01-29 09:59:07.503758 W | inventory: skipping device sr0: Failed to complete 'lsblk /dev/sr0': exit status 1. lsblk: /dev/sr0: not a block device
2020-01-29 09:59:07.503793 I | cephosd: creating and starting the osds
2020-01-29 09:59:07.543504 I | cephosd: configuring osd devices: {"Entries":{}}
2020-01-29 09:59:07.543554 I | exec: Running command: ceph-volume lvm batch --prepare
2020-01-29 09:59:08.906271 I | cephosd: no more devices to configure
2020-01-29 09:59:08.906311 I | exec: Running command: ceph-volume lvm list --format json
2020-01-29 09:59:10.841568 I | cephosd: 0 ceph-volume osd devices configured on this node
2020-01-29 09:59:10.841595 I | cephosd: devices = []
2020-01-29 09:59:10.847396 I | cephosd: configuring osd dirs: map[/var/lib/rook:-1]
2020-01-29 09:59:10.848011 I | exec: Running command: ceph osd create 652071c9-2cdb-4df9-a20e-813738c4e3f6 --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/851021116
2020-01-29 09:59:14.213679 I | cephosd: successfully created OSD 652071c9-2cdb-4df9-a20e-813738c4e3f6 with ID 0
2020-01-29 09:59:14.213744 I | cephosd: osd.0 appears to be new, cleaning the root dir at /var/lib/rook/osd0
2020-01-29 09:59:14.214417 I | cephconfig: writing config file /var/lib/rook/osd0/rook-ceph.config
2020-01-29 09:59:14.214653 I | exec: Running command: ceph auth get-or-create osd.0 -o /var/lib/rook/osd0/keyring osd allow * mon allow profile osd --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format plain
2020-01-29 09:59:17.189996 I | cephosd: Initializing OSD 0 file system at /var/lib/rook/osd0...
2020-01-29 09:59:17.194681 I | exec: Running command: ceph mon getmap --connect-timeout=15 --cluster=rook-ceph --conf=/var/lib/rook/rook-ceph/rook-ceph.config --keyring=/var/lib/rook/rook-ceph/client.admin.keyring --format json --out-file /tmp/298283883
2020-01-29 09:59:20.936868 I | exec: got monmap epoch 1
2020-01-29 09:59:20.937380 I | exec: Running command: ceph-osd --mkfs --id=0 --cluster=rook-ceph --conf=/var/lib/rook/osd0/rook-ceph.config --osd-data=/var/lib/rook/osd0 --osd-uuid=652071c9-2cdb-4df9-a20e-813738c4e3f6 --monmap=/var/lib/rook/osd0/tmp/activate.monmap --keyring=/var/lib/rook/osd0/keyring --osd-journal=/var/lib/rook/osd0/journal
2020-01-29 09:59:21.324912 I | mkfs-osd0: 2020-01-29 09:59:21.323 7fc7e2a8ea80 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2020-01-29 09:59:21.386136 I | mkfs-osd0: 2020-01-29 09:59:21.384 7fc7e2a8ea80 -1 journal FileJournal::_open: disabling aio for non-block journal. Use journal_force_aio to force use of aio anyway
2020-01-29 09:59:21.387553 I | mkfs-osd0: 2020-01-29 09:59:21.384 7fc7e2a8ea80 -1 journal do_read_entry(4096): bad header magic
2020-01-29 09:59:21.387585 I | mkfs-osd0: 2020-01-29 09:59:21.384 7fc7e2a8ea80 -1 journal do_read_entry(4096): bad header magic
2020-01-29 09:59:21.450639 I | cephosd: Config file /var/lib/rook/osd0/rook-ceph.config:
[global]
fsid = a19423a1-f135-446f-b4d9-f52da10a935f
mon initial members = a
mon host = v1:10.233.35.212:6789
public addr = 10.233.95.101
cluster addr = 10.233.95.101
mon keyvaluedb = rocksdb
mon_allow_pool_delete = true
mon_max_pg_per_osd = 1000
debug default = 0
debug rados = 0
debug mon = 0
debug osd = 0
debug bluestore = 0
debug filestore = 0
debug journal = 0
debug leveldb = 0
filestore_omap_backend = rocksdb
osd pg bits = 11
osd pgp bits = 11
osd pool default size = 1
osd pool default pg num = 100
osd pool default pgp num = 100
osd max object name len = 256
osd max object namespace len = 64
osd objectstore = filestore
rbd_default_features = 3
fatal signal handlers = false
[osd.0]
keyring = /var/lib/rook/osd0/keyring
osd journal size = 5120
2020-01-29 09:59:21.450723 I | cephosd: completed preparing osd &{ID:0 DataPath:/var/lib/rook/osd0 Config:/var/lib/rook/osd0/rook-ceph.config Cluster:rook-ceph KeyringPath:/var/lib/rook/osd0/keyring UUID:652071c9-2cdb-4df9-a20e-813738c4e3f6 Journal:/var/lib/rook/osd0/journal IsFileStore:true IsDirectory:true DevicePartUUID: CephVolumeInitiated:false LVPath: SkipLVRelease:false Location: LVBackedPV:false}
2020-01-29 09:59:21.450743 I | cephosd: 1/1 osd dirs succeeded on this node
2020-01-29 09:59:21.450755 I | cephosd: saving osd dir map
2020-01-29 09:59:21.479301 I | cephosd: device osds:[]
dir osds: [{ID:0 DataPath:/var/lib/rook/osd0 Config:/var/lib/rook/osd0/rook-ceph.config Cluster:rook-ceph KeyringPath:/var/lib/rook/osd0/keyring UUID:652071c9-2cdb-4df9-a20e-813738c4e3f6 Journal:/var/lib/rook/osd0/journal IsFileStore:true IsDirectory:true DevicePartUUID: CephVolumeInitiated:false LVPath: SkipLVRelease:false Location: LVBackedPV:false}]
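(Side note: the hierarchy that the prepare pod logs, root=default host=test-w1 rack=rack1, can also be built by hand from the toolbox with the plain Ceph CLI, e.g. the commands below. That is only a manual workaround, though, not a fix for Rook ignoring the labels.)
ceph osd crush add-bucket rack1 rack
ceph osd crush move rack1 root=default
ceph osd crush move test-w1 rack=rack1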
Do you have any idea where the problem is and how I can fix it?
Best answer
I discussed this issue with the Rook developers in this thread: https://groups.google.com/forum/#!topic/rook-dev/NIO16OZFeGY
He was able to reproduce the problem, but it seems the issue only affects OSDs that use directories; it does not occur when the OSDs use devices (e.g. raw devices).
As you can see in the thread, the issue will not be fixed, because directory-based OSDs will soon be deprecated.
I restarted my tests using raw devices instead of directories, and it works like a charm.
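(For anyone hitting the same problem, the change boils down to the storage section of the CephCluster spec: list devices instead of directories. A minimal sketch, assuming each worker exposes an empty disk at /dev/sdb; adjust the device name to your hardware:)
  storage:
    useAllNodes: false
    useAllDevices: false
    nodes:
    - name: "test-w1"
      devices:
      - name: "sdb"
    - name: "test-w2"
      devices:
      - name: "sdb"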
I would like to thank Travis for his help and his quick replies!