c - Correct algorithm but wrong implementation?

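I'm a beginner in C with some experience in Python and Java. I want to solve a problem in C. The problem is:

Take a sentence as input, with words separated only by spaces (assume lowercase only), and rewrite the sentence using the following rules:

1) The first time a word occurs, keep it unchanged.

2) If a word occurs twice, replace the second occurrence with the word doubled (e.g. two -> twotwo).

3) If a word occurs three or more times, delete every occurrence after the second.

Print the output as a sentence. The input sentence is at most 500 characters long and each word is at most 50 characters long.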

Example input: jingle bells jingle bells jingle all the way

Example output: jingle bells jinglejingle bellsbells all the way

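My approach is:

1) Read the input, split it into words, and put them in an array of char pointers.

2) Iterate over the array with nested for loops. For each word after the first: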

 A - If no earlier word is equal to it, nothing happens.

 B - If exactly one earlier word is equal to it, change the word to its "doubled form".

 C - If its "doubled form" already exists earlier in the array, delete the word (set the element to NULL).

3) Print the modified array.

I'm fairly confident this approach is correct. However, when I actually wrote the code:

'''

#include <stdio.h>
#include <string.h>

int main()
{

    char input[500];

    char *output[500];


    // Gets the input
    printf("Enter a string: ");
    gets(input);

    // Gets the first token, put it in the array
    char *token = strtok(input, " ");
    output[0] = token;

    // Keeps getting tokens and filling the array, until no blank space is found
    int i = 1;
    while (token != NULL) {
        token = strtok(NULL, " ");
        output[i] = token;
        i++;
    }


    // Processes the array, starting from the second element
    int j, k;
    char *doubled;
    for (j = 1; j < 500; j++) {
        strcpy(doubled, output[j]);
        strcat(doubled, doubled);        // Create the "doubled form"
        for (k = 0; k < j; k++) {
            if (strcmp(output[k], output[j]) == 0) {     // Situation B
                output[j] = doubled;
            }
            if (strcmp(output[k], doubled) == 0) {       // Situation C
                output[j] = ' ';
            }
        }
    }


    // Convert the array to a string
    char *result = output[0];          // Initialize a string with the first element in the array
    int l;
    char *blank_space = " ";           // The blank spaces that need to be added into the sentence
    for (l = 1; l < 500; l++) {
        if (output[l] != '\0'){        // If there is a word that exists at the given index, add it
            strcat(result, blank_space);
            strcat(result, output[l]);

        }
        else {                         // If reaches the end of the sentence
            break;
        }
    }

    // Prints out the result string
    printf("%s", result);

    return 0;
}

'''

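I tested each block separately. There are several problems:

1) While processing the array, the strcmp, strcat and strcpy calls in the loop seem to cause segmentation faults.

2) When printing the array, the words don't appear in the order they should.

I'm frustrated now, because the problems all seem to come from some structural flaw in my code and have a lot to do with C's memory model, which I'm not very familiar with. How can I fix this?

Best answer

One problem jumps out at me. This code is wrong: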

char *doubled;
for (j = 1; j < 500; j++) {
    strcpy(doubled, output[j]);
    strcat(doubled, doubled);        // Create the "doubled form"

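doubled doesn't point to any actual memory. Trying to copy data to wherever it points is therefore undefined behavior and will almost certainly cause a SIGSEGV; if it doesn't cause a SIGSEGV, it will corrupt memory.

This needs to be fixed: you can't use strcpy() or strcat() to copy a string to a pointer that doesn't point to actual memory.

A fixed-size array like this would be better, but is still not ideal, because nothing checks that the buffer doesn't overflow: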

char doubled[ 2000 ];
for (j = 1; j < 500; j++) {
    strcpy(doubled, output[j]);
    strcat(doubled, doubled);        // Create the "doubled form"
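
To add the missing overflow check, one option is to build the doubled form with an explicit size test. This is only a sketch: make_doubled is a hypothetical helper name that does not appear in the question, and it assumes dst and word do not overlap.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes `word` twice into `dst`.
 * Returns 0 on success, -1 if `dst` is too small.
 * Assumes `dst` and `word` do not overlap. */
int make_doubled(char *dst, size_t dst_size, const char *word)
{
    size_t len = strlen(word);

    /* Need room for two copies plus the terminating '\0'. */
    if (dst_size < 2 * len + 1) {
        return -1;
    }
    memcpy(dst, word, len);        /* first copy */
    memcpy(dst + len, word, len);  /* second copy, right after the first */
    dst[2 * len] = '\0';
    return 0;
}
```

With char doubled[ 2 * 50 + 1 ]; a call such as make_doubled( doubled, sizeof doubled, output[j] ) could stand in for the strcpy()/strcat() pair; 101 bytes is enough for the 50-character word limit stated in the question.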

Declaring doubled as a single array is also a problem here:

        if (strcmp(output[k], output[j]) == 0) {     // Situation B
            output[j] = doubled;
        }

This merely points output[j] at doubled. The next loop iteration overwrites doubled, so the data that output[j] still points to changes as well.
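
A small demonstration of that aliasing effect, as a sketch only; the function name and the sample words are made up for illustration:

```c
#include <string.h>

/* Hypothetical demo: returns 1 if the first stored word changed after the
 * shared buffer was reused, 0 otherwise. */
int shared_buffer_is_clobbered(void)
{
    char doubled[16];
    char *output[2];

    strcpy(doubled, "jinglejingle");
    output[0] = doubled;            /* stores the address, not a copy */

    strcpy(doubled, "bellsbells");  /* the next iteration reuses the buffer */
    output[1] = doubled;

    /* Both entries alias the same array, so output[0] now reads
     * "bellsbells" as well. */
    return strcmp(output[0], "bellsbells") == 0;
}
```

The function returns 1: both array entries hold the same address, so the first word is lost as soon as the buffer is rewritten.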

Storing a copy instead would fix that problem:

        if (strcmp(output[k], output[j]) == 0) {     // Situation B
            output[j] = strdup( doubled );
        }

strdup() is a POSIX function that, unsurprisingly, duplicates a string. The string needs to be free()'d later, though, because strdup() is equivalent to:

char *strdup( const char *input )
{
    char *duplicate = malloc( 1 + strlen( input ) );
    if ( duplicate != NULL )    /* like the real strdup(), return NULL on failure */
    {
        strcpy( duplicate, input );
    }
    return( duplicate );
}

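strcat(doubled, doubled); is also a problem, because the source and destination overlap, and strcat() has undefined behavior in that case. One possible solution: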

    memmove(doubled + strlen( doubled ), doubled, 1 + strlen( doubled ) );

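This copies the contents of the doubled string, including its terminator, to memory beginning at the original '\0' terminator. Note that because the original '\0' terminator is part of the source string and is overwritten during the copy, you can't use strcpy( doubled + strlen( doubled ), doubled );. For the same reason, you can't use memcpy() either: both require that source and destination not overlap.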
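
Putting these points together, here is a hedged sketch of the whole rewrite. It is not the asker's exact method: instead of comparing against "doubled forms", it counts earlier occurrences of each original word, and it writes into a separate output buffer rather than appending to the tokenized input (appending to output[0], which points into input, overwrites the later tokens and would likely explain the ordering problem). The names rewrite_sentence, double_in_place, MAX_WORDS and MAX_WORD are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define MAX_WORDS 256   /* assumption: 500 characters hold at most 250 words */
#define MAX_WORD  50    /* per-word limit from the problem statement */

/* Doubles the string in buf in place; returns 0, or -1 if it does not fit. */
static int double_in_place(char *buf, size_t buf_size)
{
    size_t len = strlen(buf);

    if (buf_size < 2 * len + 1) {
        return -1;
    }
    memmove(buf + len, buf, len + 1);   /* memmove tolerates the overlap */
    return 0;
}

/* Rewrites `sentence` into `out`. `sentence` is modified by strtok().
 * Returns 0 on success, -1 if a limit or the output size is exceeded. */
int rewrite_sentence(char *sentence, char *out, size_t out_size)
{
    char *words[MAX_WORDS];
    size_t n = 0, used = 0;

    if (out_size == 0) {
        return -1;
    }
    out[0] = '\0';

    /* Step 1: split into words; the pointers refer into `sentence`. */
    for (char *tok = strtok(sentence, " "); tok != NULL; tok = strtok(NULL, " ")) {
        if (n == MAX_WORDS || strlen(tok) > MAX_WORD) {
            return -1;
        }
        words[n++] = tok;
    }

    /* Steps 2 and 3: decide per word, then append to the separate buffer. */
    for (size_t j = 0; j < n; j++) {
        char piece[2 * MAX_WORD + 1];
        size_t seen = 0, len;

        for (size_t k = 0; k < j; k++) {
            if (strcmp(words[k], words[j]) == 0) {
                seen++;
            }
        }
        if (seen >= 2) {
            continue;                   /* rule 3: third and later are dropped */
        }
        strcpy(piece, words[j]);        /* fits: length was checked above */
        if (seen == 1 && double_in_place(piece, sizeof piece) != 0) {
            return -1;                  /* rule 2 could not be applied */
        }

        len = strlen(piece);
        if (used + (used > 0) + len + 1 > out_size) {
            return -1;                  /* output buffer too small */
        }
        if (used > 0) {
            out[used++] = ' ';          /* separator before every word but the first */
        }
        memcpy(out + used, piece, len + 1);
        used += len;
    }
    return 0;
}
```

Called on a writable copy of "jingle bells jingle bells jingle all the way" with a sufficiently large out, this yields "jingle bells jinglejingle bellsbells all the way". Reading the line with fgets() rather than gets() would also avoid overflowing input.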
