algorithm - 在不超过特定组规模的情况下，通过合并组来最大程度地减少组的数量

注：这不是作业，是实习项目。
情况：
你有一个不同大小的n组列表，
任何组都不能包含超过X个元素，
假设您有一个函数merge(G1, G2)，它将groupG2的所有元素添加到groupG1中，并从组列表中删除G2。
编辑：组的每个元素成员对所有组都是唯一的（即，如果组1具有元素A；A不存在于任何其他组中）
问题：
您需要通过合并组合大小小于X的组来最小化组数
我最初的想法是：
使用贪心算法，其功能如下：

sort the arrays by decreasing order,

Then while array.size > 0:

   Pop largest group (lets call it GL) from the main list, and add it to a toBig list

   Then loop through the array until you find a group that can be merged with GL

      Merge the groups and add the merged group to a toRemove list

      Keep going and merging any group that fits

   once loop ends, remove all groups in toRemove from the main list

Continue While Loop

你们对这种方法有什么看法，它会产生最小数量的组（或接近的组）有没有更优雅的方法或更有效的算法？
谢谢你的意见
P.S.我试图搜索这个问题，但我不知道问题的名称是什么，在SE和Google上搜索对它的描述没有得到相关的结果。

最佳答案

首先，实际的元素值对于解决方案来说无关紧要。它们只是一个合并的细节唯一重要的是组中元素的数量换言之，创建一个“计数列表”，其中只包含计数。
对列表进行排序。从低到高或从高到低，都无关紧要。我们要从高端“流行”出来。让我们调用计数列表ginp和输出列表gout让我们调用弹出元素gbig。
所以，如果GBIG的尺寸是X，就把它加到痛风中。一直跳到痛风gtry如果（gtry+gbig）这样做直到吉普筋疲力尽。这是一个“第一次拟合”算法。这是一个工作的基准。因为这类问题，它可能是最好的解决方案，但我想你需要一个“最合适的”，这是一个有点复杂。
考虑另一种策略。有了一个给定的gbig和一个给定的ginp，它们就形成了一个“当前状态”。假设[hi to lo sort]，在gbig被弹出后，ginp[0]将适合在第一次比赛中，我们就拿下了。
但是，假设我们跳过了它（只是为了笑），选择了ginp[1]作为gtry并接受了它。现在，对于gbig的下一个添加，我们可以选择ginp[2]或者跳过它对其余的ginp重复此操作（选择或跳过），直到其中一个gbig达到x或者我们已经超过了ginp的末尾。
最后，现在弹出另一个gbig并重复，只使用上一轮中未选择的元素请注意，您只需不断递归，直到ginp最终耗尽为止在每个步骤中，我们从剩余的部分中选择一个匹配项这样做递归地形成一个[binary]树（基于take/skip），它将枚举所有可能的选择注意，一些节点不会有take，因为ginp中的下一个元素太大（例如g big会溢出X）
在这个递归中，当ginp耗尽时，保持gout中元素计数的最小值（它也需要是一个计数）。根节点的路径为您提供了所需的内容：合并操作列表，等等。只要您得到一个新的最小痛风值，就保存它。
最大树深度将为<n
更新：需要测试数据吗？下面是一个生成器程序：

#!/usr/bin/perl
# grpgen -- generate test data for group reduction problem
#
# arguments:
#   "-X" - maximum value for X
#   "-n" - maximum value for n
#   "-T" - number of tests to generate
#   "-O" - output file
#   "-f" -- generate full element data
#
# NOTE: with no arguments or missing arguments will prompt

master(@ARGV);
exit(0);

# master -- master control
sub master
{
    local(@argv) = @_;

    select(STDOUT);
    $| = 1;

    $Xmax = getstr(2,"-X","maximum for X");
    $nmax = getstr(2,"-n","maximum for n");
    $tstmax = getstr(2,"-T","number of tests");
    $keyf = getstr(1,"-f","generate full element data");
    $ofile = getstr(0,"-O","output file name");

    open($xfdst,">$ofile") ||
        die("master: unable to open '$ofile' -- $!\n");

    for ($tstcur = 1;  $tstcur <= $tstmax;  ++$tstcur) {
        gentest();
    }

    close($xfdst);
}

# getstr -- get a string/number
sub getstr
{
    my($numflg,$opt,$prompt) = @_;
    my($arg);
    my($askflg);
    my($val);

    {
        # search command line for -whatever
        foreach $arg (@argv) {
            if ($arg =~ /^$opt(.*)$/) {
                $val = $1;
                $val = 1
                    if ($numflg && ($val eq ""));
                last;
            }
        }
        last if (defined($val));

        $askflg = 1;

        while (1) {
            printf("Enter ")
                if ($numflg != 1);

            printf("%s",$prompt);

            if ($numflg == 1) {
                printf(" (0/1)? ");
            }
            else {
                printf(": ");
            }

            $val = <STDIN>;
            chomp($val);

            if ($numflg == 0) {
                last if ($val ne "");
                next;
            }

            next unless ($val =~ /^\d+$/);
            $val += 0;

            last if ($val > 0);
            last if ($numflg == 1);
        }
    }

    unless ($askflg) {
        printf("%s: %s\n",$prompt,$val);
    }

    $val;
}

# gentest -- generate a test
sub gentest
{
    local($lhs);
    local($pre);

    $Xlim = getrand($Xmax);
    $xfmt = fmtof($Xlim);

    $nlim = getrand($nmax);
    $nfmt = fmtof($nlim);

    printf($xfdst "\n");
    printf($xfdst "T=%d X=%d n=%d\n",$tstcur,$Xlim,$nlim);

    for ($nidx = 1;  $nidx <= $nlim;  ++$nidx) {
        $xcur = getrand($Xmax);
        if ($keyf) {
            gengrpx();
        }
        else {
            gengrp();
        }
    }

    genout();
}

# gengrp -- generate group (counts only)
sub gengrp
{
    my($rhs);

    $rhs = sprintf($xfmt,$xcur);
    genout($rhs);
}

# gengrpx -- generate group (with element data)
sub gengrpx
{
    my($elidx,$rhs);

    $pre = sprintf("$nfmt:",$nidx);

    # NOTE: this is all irrelevant semi-junk data, so just wing it
    for ($elidx = 1;  $elidx <= $xcur;  ++$elidx) {
        $rhs = sprintf($xfmt,$elidx);
        genout($rhs);
    }

    genout();
}

# genout -- output data
sub genout
{
    my($rhs) = @_;

    {
        if (defined($rhs)) {
            last if ((length($pre) + length($lhs) + length($rhs)) < 78);
        }

        last if ($lhs eq "");

        print($xfdst $pre,$lhs,"\n");
        undef($lhs);
    }

    $lhs .= $rhs
        if (defined($rhs));
}

# getrand -- get random number
sub getrand
{
    my($lim) = @_;
    my($val);

    $val = int(rand($lim));
    $val += 1;

    $val;
}

# fmtof -- get number format
sub fmtof
{
    my($num) = @_;
    my($fmt);

    $fmt = sprintf("%d",$num);
    $fmt = length($fmt);
    $fmt = sprintf(" %%%dd",$fmt);

    $fmt;
}

关于algorithm - 在不超过特定组规模的情况下，通过合并组来最大程度地减少组的数量，我们在Stack Overflow上找到一个类似的问题：https://stackoverflow.com/questions/33407442/