I. Symptoms

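After a virtual machine is created on the OpenStack platform, the web dashboard shows that it has been assigned an IP address, but checking the network status from the VM console shows that the VM has no IP address. The screenshots below show the fault: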

[Screenshots: the dashboard listing an IP address for the VM; the VM console showing no IP address]

II. Analysis

(1) Check the status of the Neutron agents and make sure the DHCP agent (neutron-dhcp-agent) is running normally

root@controller22::~#neutron agent-list
neutron CLI is deprecated and will be removed in the future. Use openstack CLI instead.
+--------------------------------------+--------------------+------------+-------------------+-------+----------------+---------------------------+
| id | agent_type | host | availability_zone | alive | admin_state_up | binary |
+--------------------------------------+--------------------+------------+-------------------+-------+----------------+---------------------------+
| 3812cb30---bd75-9634687937f6 | DHCP agent | controller | nova | :-) | True | neutron-dhcp-agent |
| 51a30db0--42de-b5d8-6b04e2a13baf | Open vSwitch agent | storage | | :-) | True | neutron-openvswitch-agent |
| 63416b42-376b--b89d-12694faa2bf9 | L3 agent | controller | nova | :-) | True | neutron-l3-agent |
| 7ce3b592-240f--bf09-9a7ecbfa7d3c | Open vSwitch agent | controller | | :-) | True | neutron-openvswitch-agent |
| 851ccdd9-ff14-4e8f-971c-9343787ef056 | Open vSwitch agent | compute | | :-) | True | neutron-openvswitch-agent |
| 8c458dca-a306--a851-1c47a19ab3c1 | Metadata agent | controller | | :-) | True | neutron-metadata-agent |
+--------------------------------------+--------------------+------------+-------------------+-------+----------------+---------------------------+
root@controller22::~#
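
The warning in the output above says the neutron CLI is deprecated; `openstack network agent list` prints the same kind of table. As a rough sketch, the table can be filtered so that only agents that are not alive remain. The function name `list_dead_agents` is made up for this example, and it assumes the `|`-delimited table layout shown above, with `:-)` marking a live agent.

```shell
# Print "<agent type> on <host>" for every agent whose alive column is not ":-)".
# Reads an agent-list table on stdin.
# Usage (assumed): openstack network agent list | list_dead_agents
list_dead_agents() {
  awk -F'|' '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    NF < 3 { next }                 # skip border lines, prompts and warnings
    !alive_col {                    # look for the header row first
      for (i = 1; i <= NF; i++) {
        h = tolower(trim($i))
        if (h == "alive") alive_col = i
        if (h == "agent_type" || h == "agent type") type_col = i
        if (h == "host") host_col = i
      }
      next
    }
    trim($alive_col) != ":-)" { print trim($type_col) " on " trim($host_col) }
  '
}
```

If every agent is healthy, as in the listing above, the function prints nothing.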

(2) Check that the dnsmasq process is running normally

root@controller22::/var/log/neutron#ps aux | grep dnsmasq
nobody 0.0 0.0 ? S : : dnsmasq --no-hosts --no-resolv --strict-order --except-interface=lo --pid-file=/var/lib/neutron/dhcp/1a426ffe-2bf0--96a5-74402004a17b/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/1a426ffe-2bf0--96a5-74402004a17b/host --addn-hosts=/var/lib/neutron/dhcp/1a426ffe-2bf0--96a5-74402004a17b/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/1a426ffe-2bf0--96a5-74402004a17b/opts --dhcp-leasefile=/var/lib/neutron/dhcp/1a426ffe-2bf0--96a5-74402004a17b/leases --dhcp-match=set:ipxe, --bind-interfaces --interface=tap2c7d9cb9- --dhcp-range=set:tag0,172.16.0.0,static,86400s --dhcp-option-force=option:mtu, --dhcp-lease-max= --conf-file= --domain=openstacklocal
root 0.0 0.0 pts/ R+ : : grep --color=auto dnsmasq
root@controller22::/var/log/neutron#

(3) Check whether the br-int integration bridge in OVS has a tap device connected to the DHCP agent's namespace

root@controller22::~#ovs-vsctl show
552eea67--410a-b683-644af569c52d
Manager "ptcp:6640:127.0.0.1"
is_connected: true
Bridge br-ex
Port "eth2"
Interface "eth2"
Port br-ex
Interface br-ex
type: internal
Port "qg-91819abf-e1"
Interface "qg-91819abf-e1"
type: internal
Bridge br-int
Controller "tcp:127.0.0.1:6633"
is_connected: true
fail_mode: secure
Port "tap2c7d9cb9-96"
tag:
Interface "tap2c7d9cb9-96"
type: internal
Port br-int
Interface br-int
type: internal
Port patch-tun
Interface patch-tun
type: patch
options: {peer=patch-int}
Port "qr-4056447b-ea"
tag:
Interface "qr-4056447b-ea"
type: internal
Bridge br-tun
Controller "tcp:127.0.0.1:6633"
is_connected: true
fail_mode: secure
Port "vxlan-c0a8fe97"
Interface "vxlan-c0a8fe97"
type: vxlan
options: {df_default="true", in_key=flow, local_ip="192.168.254.150", out_key=flow, remote_ip="192.168.254.151"}
Port patch-int
Interface patch-int
type: patch
options: {peer=patch-tun}
Port br-tun
Interface br-tun
type: internal
ovs_version: "2.9.0"
root@controller22::~#

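Among the DHCP namespaces, find the one that belongs to the network in question, locate the tap device that matches the port on br-int, and check its IP configuration: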

root@controller22::/var/log/neutron#ip netns show
qrouter-3028515a-106a-4d77-b2bb-edd34ddbc7c7 (id: )
qdhcp-1a426ffe-2bf0--96a5-74402004a17b (id: )
root@controller22::/var/log/neutron#
root@controller22::/var/log/neutron#
root@controller22::/var/log/neutron#ip netns exec qdhcp-1a426ffe-2bf0--96a5-74402004a17b ip a
: lo: <LOOPBACK,UP,LOWER_UP> mtu qdisc noqueue state UNKNOWN group default qlen
link/loopback ::::: brd :::::
inet 127.0.0.1/ scope host lo
valid_lft forever preferred_lft forever
inet6 ::/ scope host
valid_lft forever preferred_lft forever
: tap2c7d9cb9-: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu qdisc noqueue state UNKNOWN group default qlen
link/ether fa::3e::: brd ff:ff:ff:ff:ff:ff
inet 172.16.199.10/ brd 172.16.255.255 scope global tap2c7d9cb9-
valid_lft forever preferred_lft forever
inet 169.254.169.254/ brd 169.254.255.255 scope global tap2c7d9cb9-
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe18:/ scope link
valid_lft forever preferred_lft forever
root@controller22::/var/log/neutron#

III. Locating the problem

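The checks above show that br-int does have a tap device connected to a DHCP namespace, but its address (172.16.199.10) is the DHCP service IP of the external network. There is no DHCP namespace for the 192.168.168.0/24 network that the VM is attached to.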
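
This check can be scripted. The sketch below tests whether a namespace list contains the `qdhcp-<network-id>` entry for a given network; the function name `has_dhcp_namespace` is invented for illustration, and it expects the output of `ip netns` on stdin.

```shell
# Succeed (exit 0) if the namespace list on stdin contains qdhcp-<network-id>.
# Usage (assumed): ip netns list | has_dhcp_namespace "$NETWORK_ID"
has_dhcp_namespace() {
  local network_id="$1"
  awk -v ns="qdhcp-${network_id}" '
    $1 == ns { found = 1 }          # first field is the namespace name
    END { exit !found }             # exit 0 only when a match was seen
  '
}
```

The network ID can be taken from the dashboard or from `openstack network show <network> -f value -c id`. A non-zero exit status for the VM's network corresponds to the situation described above.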

IV. Fixing the fault

(1) Find the subnet of the affected network and enable DHCP on it by ticking the checkbox

[Screenshot: subnet settings with DHCP enabled]
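
The same change can probably be made from the command line instead of the dashboard. The sketch below wraps `openstack subnet set --dhcp` and then reads back the subnet's `enable_dhcp` field. The function name `enable_subnet_dhcp` and the `OPENSTACK_CMD` override (used only for dry runs) are assumptions of this example, not part of the original procedure.

```shell
# Enable DHCP on a subnet, then print its enable_dhcp value.
# OPENSTACK_CMD is a hypothetical override, e.g. OPENSTACK_CMD=echo for a dry run.
enable_subnet_dhcp() {
  local subnet="$1"
  local cli="${OPENSTACK_CMD:-openstack}"
  "$cli" subnet set --dhcp "$subnet" &&
    "$cli" subnet show "$subnet" -f value -c enable_dhcp
}
```

Calling `enable_subnet_dhcp <subnet-name-or-id>` with admin credentials loaded should have the same effect as ticking the checkbox. The DHCP agent then needs a moment to create the port and the namespace.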

(2) Then check that the subnet now has a DHCP port with an IP address, and that the tap device in the DHCP namespace has an IP address as well

[Screenshot: the subnet's port list showing a DHCP port with an IP address]

(3) List the namespaces again: a new DHCP namespace has appeared

root@controller23::/var/log/neutron#ip netns show
qdhcp-cb06eada--46e7-bcd8-c9c07937231d (id: )
qrouter-3028515a-106a-4d77-b2bb-edd34ddbc7c7 (id: )
qdhcp-1a426ffe-2bf0--96a5-74402004a17b (id: )

(4) Check the IP configuration of the new DHCP namespace: it holds the DHCP service IP of the subnet (192.168.168.2)

root@controller23::/var/log/neutron#ip netns exec qdhcp-cb06eada--46e7-bcd8-c9c07937231d ip a
: lo: <LOOPBACK,UP,LOWER_UP> mtu qdisc noqueue state UNKNOWN group default qlen
link/loopback ::::: brd :::::
inet 127.0.0.1/ scope host lo
valid_lft forever preferred_lft forever
inet6 ::/ scope host
valid_lft forever preferred_lft forever
: tap865fcb34-fc: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu qdisc noqueue state UNKNOWN group default qlen
link/ether fa::3e:ce:2f:9b brd ff:ff:ff:ff:ff:ff
inet 192.168.168.2/ brd 192.168.168.255 scope global tap865fcb34-fc
valid_lft forever preferred_lft forever
inet 169.254.169.254/ brd 169.254.255.255 scope global tap865fcb34-fc
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fece:2f9b/ scope link
valid_lft forever preferred_lft forever
root@controller23::/var/log/neutron#

(5) Reboot the VM; it now obtains an IP address

[Screenshot: the VM console showing the IP address it obtained]

V. Summary

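When a request to create a VM is issued, Neutron allocates a MAC address and an IP address for the VM's port, and the DHCP agent writes them into the host file under /var/lib/neutron/dhcp/<network-id>/ that dnsmasq reads. When the VM broadcasts a DHCP request on its internal network, the dnsmasq process listening in the DHCP namespace receives it and hands the VM the IP address recorded for its MAC in the host file.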
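
To confirm that a VM's MAC address has an entry in that host file, a simple case-insensitive search is usually enough. This is only a sketch: the function name `find_dhcp_host_entry` is made up, and the exact field layout of each entry can differ between releases, so the whole matching line is printed rather than a single field.

```shell
# Print the host-file entries that mention the given MAC address.
# Arguments: <mac-address> <path-to-host-file>
# Usage (assumed): find_dhcp_host_entry "$MAC" "/var/lib/neutron/dhcp/$NETWORK_ID/host"
find_dhcp_host_entry() {
  local mac="$1" hosts_file="$2"
  # -i: MAC case may differ; -F: treat the MAC as a literal string, not a regex
  grep -i -F -- "$mac" "$hosts_file"
}
```

An empty result (exit status 1) suggests that dnsmasq has no static lease for that MAC, so the VM's DHCP request would go unanswered.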

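So for a VM to obtain an IP address, the packets it sends must be able to reach the DHCP namespace, which means that every virtual network device along the path must exist and work correctly.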

If DHCP is not enabled on the subnet as described above, the network has no DHCP namespace and therefore no DHCP service, so the VM cannot obtain an IP address.

05-28 11:10