【LOJ511】[LibreOJ NOI Round #1]验题（动态DP）

我这道题写了整！整！三！天！

我要一定要写这篇博客来表达我复！杂！的！心！情！

题目

LOJ511

官方题解（这个题解似乎不是很详细，我膜 std 才看懂的）

调这道题验证了我校某人的一句话：调题是一个熵减的过程，和 std 越来越像。

分析

由于我写了三天，我也想让你们跟我一样自行去看 std 然后写三天，所以我决定不写这题的题解。

我的思路和题解不完全一样。然而由于上述原因，我最终越写越像 std 。

为了方便叙述，假设我们已经在意大利面的帮助下知道了如何在规定某些点必须选，某些点不能选，某些点可选可不选的情况下求出（动态维护）独立集的方案数。

这是一个跟字典序有关的问题，（按照套路）自然想到逐位确定。如果没有 \(I\) 的限制，仅仅是求字典序第 \(k\) 小的独立集，就非常好做。假设现在已经确定了一个前缀，前缀的最后一位是 \(x\) ，那么所有不超过 \(x\) 的数以后都不能再选了。从小到大枚举这一位选的数 \(i(i>x)\) ，然后计算有多少个以这个前缀开始的独立集（使用上述意大利面的方法）。设这个数量为 \(t\) 。如果 \(t>k\) ，说明当前前缀就是答案的前缀，给 \(k\) 减去 \(1\) ，考虑下一位（因为决定往后继续选就越过了「只选当前前缀」这个方案）；否则，给 \(k\) 减去 \(t\) ，继续枚举这一位。

那么如果有 \(I\) 的限制呢？现在要做两件事：第一，求出答案和 \(I\) 的最长公共前缀（下称 LCP ）是多少，设为 \(p\) ；第二，求出答案是以这个 LCP 为前缀的第几个，也就是要把 \(k\) 减去所有字典序在 \(I\) 之后而 LCP 比 \(p\) 更大的方案数（即介于 \(I\) 和 LCP 之间的方案数），设为 \(k'\) 。这样，我们只需要用上一段所说的方法求以 LCP 为前缀的第 \(k'\) 个就行了。

从大往小枚举 LCP 的长度 \(0\leq i \leq n\) 。若 LCP 的长度严格为 \(i-1\) ，则 \(x=I_j,j<i\) 必须选，\(x< I_i,x\notin I\) 和 \(I_i\) 都不能选，\(x>I_i\) 可选可不选。这可以用意大利面法算出方案数 \(t\) 。如果 \(k \geq t\) ，就给 \(k\) 减去 \(t-1\) （因为「只选当前前缀」这种方案不比 \(I\) 的字典序大）。否则，说明 \(p=i-1\) 。

好了，那么剩下的问题就是如何在规定了某些点必须选，某些点不能选，某些点可选可不选的情况下动态维护独立集的方案数，也就是上面说的意大利面法。当然，这个方法跟意大利面没有任何关系，我只是随口说了一个名字。

如果参加过 NOIP2018 或者看过【知识总结】动态 DP ，一定能发现这是一个赤裸裸的动态 DP —— 严谨地说，这是一个计数问题而不是 DP ，但这也使得它使用的是原始的关于加法和乘法的矩阵乘法，而不是动态 DP 中魔改过的关于取最值和加法的矩阵乘法。当然，定义的问题并不影响照 sys 画嘴子照葫芦画瓢地解决这个问题：设 \(f_{u,0/1}\) 表示当 \(u\) 没选 / 选了时子树中的方案数，\(g_{u,0/1}\) 表示\(u\) 没选 / 选了且不考虑 \(u\) 的重儿子（设为 \(v\) ）时的方案数。那么就有：

\[\begin{bmatrix}
g_{u,0},g_{u,0}\\
g_{u,1},0\\
\end{bmatrix}
\begin{bmatrix}
f_{v,0}\\
f_{v,1}
\end{bmatrix}=
\begin{bmatrix}
f_{u,0}\\f_{u,1}\end{bmatrix}\]

和动态 DP 类似，先树链剖分，用线段树维护重链上矩阵的乘积，修改时用链首的 \(f\) （也就是整条重链的矩阵的乘积）去更新链首父亲的 \(g\) 。

还存在一个问题：独立集方案数可能非常多，long long 存不下。如果方案数大于 \(k\) ，就不需要知道它的确切值，所以可以直接在计算时对一个大数（如 \(2\times10^{18}\) ）取 \(\min\) 。但在这种情况下，由于过大的数没有记录准确值，所以不能通过减去旧贡献、加上新贡献来更新链首父亲的 \(g\) 。考虑 \(g\) 是怎么算的。设 \(v\) 是 \(u\) 的轻儿子：

\[g_{u,1}=\begin{cases}\begin{aligned}&\prod f_{v,0}\ \ &u\ 可以选 \\ &0\ &u\ 不能选\end{aligned}\end{cases}
\]

\[g_{u,0}=\begin{cases}\begin{aligned}&\prod (f_{v,0}+f_{v,1})\ \ &u\ 可以不选 \\ &0\ &u\ 必须选\end{aligned}\end{cases}
\]

给每个结点开一棵线段树，在上面维护所有轻儿子的 \((f_{v,0}+f_{v,1})\) 的乘积和 \(f_{v,0}\) 的乘积。更新 \(g\) 时先修改链首父亲的线段树，然后通过上述公式构造出 \(g\) 。

完结撒花？时间复杂度 \(O(n\log^2n)\) 。

教你如何从 1 分 20 秒卡常到 7 秒以内

睁大眼睛看看数据范围！\(n\leq 10^6\) 啊！算法的复杂度是 \(O(n\log^2n)\) 啊！瓶颈上还是个常数巨巨巨大的矩阵乘法啊！这个矩阵乘法还要先用 double 来估算答案看是否对 \(2\times10^{18}\) 取 \(\min\) 啊！

也就是在这个卡常的过程中，我和 std 越写越像。

开多棵线段树

注意到这里查询只会查一整条重链的信息。因此可以给每条重链单独开一个线段树，这样不仅可能使深度减少，更重要的是去掉了区间查询这个常数巨大的操作，改为 \(O(1)\) 的全局查询。同理也可以给每个结点单独开一个线段树维护所有轻儿子。

zkw 线段树

只有单点修改和全局查询，用 zkw 线段树非常好写。代码很简单，即使像我一样之前没学过 zkw 线段树的人应该也能看懂。

直接构造初始化

我一开始写的初始化是这样的（把所有点设为不能选）：

for (int i = 1; i <= n; i++)

	update(i, 0);

这相当于跑满了 \(O(n\log^2n)\) ，巨慢无比。当所有点都不能选时，对于每个点 \(u\) 都有 \(f_{u,0}=g_{u,0}=1,f_{u,1}=g_{u,1}=0\) 。因此直接对每个点修改线段树上的值即可。

代码

#include <cstdio>

#include <algorithm>

#include <cstring>

#include <cctype>

using namespace std;

namespace zyt

{

	template<typename T>

	inline bool read(T &x)

	{

		char c;

		bool f = false;

		x = 0;

		do

			c = getchar();

		while (c != EOF && c != '-' && !isdigit(c));

		if (c == EOF)

			return false;

		if (c == '-')

			f = true, c = getchar();

		do

			x = x * 10 + c - '0', c = getchar();

		while (isdigit(c));

		if (f)

			x = -x;

		return true;

	}

	template<typename T>

	inline void write(T x)

	{

		static char buf[20];

		char *pos = buf;

		if (x < 0)

			putchar('-'), x = -x;

		do

			*pos++ = x % 10 + '0';

		while (x /= 10);

		while (pos > buf)

			putchar(*--pos);

	}

	typedef long long ll;

	typedef long double ld;

	typedef pair<ll, ll> pll;

	const int N = 1e6 + 10;

	const ll LINF = 0x3f3f3f3f3f3f3f3fLL;

	int n, m, fix[N], I[N], x[N], y[N], head[N], ecnt;

	ll k, tot;

	struct edge

	{

		int to, next;

	}e[N << 1];

	void add(const int a, const int b)

	{

		e[ecnt] = (edge){b, head[a]}, head[a] = ecnt++;

	}

	inline ll mul(const ll a, const ll b)

	{

		//++tot;

		if ((double)a * b > LINF)

			return LINF;

		else

			return a * b;

	}

	pll operator * (const pll &a, const pll &b)

	{

		return pll(mul(a.first, b.first), mul(a.second, b.second));

	}

	class Matrix

	{

	private:

		int n, m;

	public:

		ll data[2][2];

		Matrix(const int _n = 0, const int _m = 0)

			: n(_n), m(_m)

		{

			for (int i = 0; i < n; i++)

				memset(data[i], 0, sizeof(ll[m]));

		}

		Matrix operator * (const Matrix &b) const

		{

			Matrix ans(n, b.m);

			for (int i = 0; i < n; i++)

				for (int k = 0; k < m; k++)

					for (int j = 0; j < b.m; j++)

						ans.data[i][j] = min(LINF, ans.data[i][j] + mul(data[i][k], b.data[k][j]));

			return ans;

		}

	};

	void init(Matrix &m)

	{

		m = Matrix(2, 2);

		m.data[0][0] = m.data[1][1] = 1;

	}

	void init(pll &p)

	{

		p.first = p.second = 1;

	}

	inline pll ftop(const Matrix &m)

	{

		return pll(m.data[0][0] + m.data[1][0], m.data[0][0]);

	}

	inline Matrix ptog(const pll &p)

	{

		Matrix g(2, 2);

		g.data[0][0] = g.data[0][1] = p.first;

		g.data[1][0] = p.second;

		return g;

	}

	template<typename T>

	class Segment_Tree

	{

		struct node

		{

			T m;

		}*tree;

		int m;

	public:

		Segment_Tree(const int n = 0)

		{

			m = 1;

			while (m <= n)

				m <<= 1;

			tree = new node[m << 1];

			for (int i = 0; i < (m << 1); i++)

				init(tree[i].m);

		}

		void change(const int pos, const T &x)

		{

			int p = pos + m;

			tree[p].m = x;

			while (p > 1)

				p >>= 1, tree[p].m = tree[p << 1].m * tree[p << 1 | 1].m;

		}

		T query()

		{

			return tree[1].m;

		}

	};

	Segment_Tree<Matrix>t1[N];

	Segment_Tree<pll>t2[N];

	namespace Tree_Chain_Cut

	{

		int dfn[N], dfn2[N], len[N], num[N], top[N], size[N], son[N], fa[N], end[N];

		void dfs1(const int u, const int f)

		{

			fa[u] = f, size[u] = 1;

			for (int i = head[u]; ~i; i = e[i].next)

			{

				int v = e[i].to;

				if (v == f)

					continue;

				dfs1(v, u);

				size[u] += size[v];

				if (size[v] > size[son[u]])

					son[u] = v;

			}

		}

		void dfs2(const int u, const int t)

		{

			top[u] = t;

			end[t] = u;

			for (int i = head[u]; ~i; i = e[i].next)

			{

				int v = e[i].to;

				if (v == fa[u] || v == son[u])

					continue;

				dfn2[v] = ++num[u];

			}

			dfn[u] = ++len[top[u]];

			if (son[u])

				dfs2(son[u], t);

			for (int i = head[u]; ~i; i = e[i].next)

			{

				int v = e[i].to;

				if (v == fa[u] || v == son[u])

					continue;

				dfs2(v, v);

			}

		}

	}

	void update(int u, const int f)

	{

		using namespace Tree_Chain_Cut;

		fix[u] = f;

		t1[top[u]].change(dfn[u],

			ptog(t2[u].query() * pll(fix[u] == 1 ? 0 : 1, fix[u] == 0 ? 0 : 1)));

		while (fa[top[u]])

		{

			t2[fa[top[u]]].change(dfn2[top[u]], ftop(t1[top[u]].query()));

			t1[top[fa[top[u]]]].change(dfn[fa[top[u]]],

				ptog(t2[fa[top[u]]].query()

					* pll(fix[fa[top[u]]] == 1 ? 0 : 1, fix[fa[top[u]]] == 0 ? 0 : 1)));

			u = fa[top[u]];

		}

	}

	ll cal()

	{

		using namespace Tree_Chain_Cut;

		Matrix tmp = t1[1].query();

		return min(tmp.data[0][0] + tmp.data[1][0], LINF);

	}

	int ans[N], cnt;

	int work()

	{

		using namespace Tree_Chain_Cut;

		read(n), read(k);

		memset(head, -1, sizeof(int[n + 1]));

		for (int i = 1; i < n; i++)

			read(x[i]), ++x[i];

		for (int i = 1; i < n; i++)

			read(y[i]), ++y[i];

		for (int i = 1; i < n; i++)

			add(x[i], y[i]), add(y[i], x[i]);

		dfs1(1, 0), dfs2(1, 1);

		for (int i = 1; i <= n; i++)

		{

			if (i == top[i])

				t1[i] = Segment_Tree<Matrix>(len[i]);

			t2[i] = Segment_Tree<pll>(num[i]);

		}

		Matrix A(2, 2);

		A.data[0][0] = A.data[0][1] = 1;

		for (int i = 1; i <= n; i++)

		{

			t1[top[i]].change(dfn[i], A);

			if (dfn2[i])

				t2[fa[i]].change(dfn2[i], pll(1, 1));

		}

		read(m);

		for (int i = 1; i <= m; i++)

			read(I[i]), update(++I[i], 1);

		sort(I + 1, I + m + 1);

		for (int i = I[m] + 1; i <= n; i++)

			update(i, -1);

		ll tmp;

		int pos = 0;

		if (k >= (tmp = cal()))

		{

			k -= tmp - 1;

			for (int i = m; i > 0; i--)

			{

				update(I[i], 0);

				if (k >= (tmp = cal()))

					k -= tmp - 1;

				else

				{

					pos = i;

					break;

				}

				for (int j = I[i - 1] + 1; j <= I[i]; j++)

					update(j, -1);

			}

		}

		else

			pos = m + 1;

		if (!pos)

		{

			puts("");

			return 0;

		}

		for (int i = 1; i < pos; i++)

			ans[++cnt] = I[i];

		for (int i = I[min(pos, m)] + 1; i <= n && k; i++)

		{

			update(i, 1);

			if (k > (tmp = cal()))

				k -= tmp, update(i, 0);

			else

				ans[++cnt] = i, --k;

		}

		//fprintf(stderr, "%lld", tot);

		if (k)

			puts("");

		else

			for (int i = 1; i <= cnt; i++)

				write(ans[i] - 1), putchar(' ');

		return 0;

	}

}

int main()

{

	freopen("511.in", "r", stdin);

	return zyt::work();

}