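I am using libxml2 in C on Linux to extract information from an XML file. I wrote a function that finds the first occurrence of a given tag and returns a copy of that tag.
For example, given the following XML text: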

<a><b>First occurrence of tag b<c>Child node</c></b><b>Second occurrence of tag b</b></a>

I want to extract the first <b> tag, including its children if it has any.
Below is a stripped-down version of the code I am using:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

#ifdef LIBXML_TREE_ENABLED

static int
xml_extract_first_occurrence_by_name(xmlNode * start_node, xmlNode * first_occurrence, xmlChar * node_name)
{
    xmlNode *cur_node = NULL;

    for (cur_node = start_node; cur_node; cur_node = cur_node->next) {
        if ( !xmlStrcmp(cur_node->name, node_name) ) {
            // use libxml2 function xmlCopyNode to recursively copy cur_node (i.e. copy node + children) to first_occurrence
            // (since cur_node will not be valid outside this function)
            *first_occurrence = *xmlCopyNode(cur_node, 1);
            return 1;
        }

        // if not found in parent search recursively in children
        if (cur_node->children)
            if ( xml_extract_first_occurrence_by_name(cur_node->children, first_occurrence, node_name) )
                return 1;
    }
}

int
main(int argc, char **argv)
{
    xmlDoc *doc = NULL;
    xmlNode *root_element = NULL;

    if (argc != 2)
        return(1);

    // initialize libxml2 and check for version mismatches
    LIBXML_TEST_VERSION

    /*parse the file and get the DOM */
    doc = xmlReadFile(argv[1], NULL, XML_PARSE_NOBLANKS);

    /*Get the root element node */
    root_element = xmlDocGetRootElement(doc);

    // allocate memory for the node to be extracted (including up to 10 children)
    xmlNode *first_occurrence = malloc(11 * sizeof(xmlNode));

    // search in tag root_element the first occurrence of tag "b"
    xml_extract_first_occurrence_by_name(root_element, first_occurrence, "b");

    printf("\nFirst occurrence of tag b -> Text content: %s\n", first_occurrence->children->content);
    printf("\nFirst occurrence of tag b -> Child tag: %s\n", first_occurrence->children->next->name);
    printf("\nFirst occurrence of tag b -> Child tag text content: %s\n", first_occurrence->children->next->children->content);

    free(first_occurrence);

    /*free the document */
    xmlFreeDoc(doc);

    /*
     *Free the global variables that may
     *have been allocated by the parser.
     */
    xmlCleanupParser();

    return 0;
}
#else
int main(void) {
    fprintf(stderr, "Tree support not compiled in\n");
    exit(1);
}
#endif

The program runs as expected and compiles without warnings. However, when I run it under valgrind with leak checking enabled, it reports an error:
valgrind --leak-check=full ./first-occurrence-test.bin first-occurrence-test.xml
==18986== Memcheck, a memory error detector
==18986== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==18986== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==18986== Command: ./first-occurrence-test.bin first-occurrence-test.xml
==18986==

First occurrence of tag b -> Text content: First occurrence of tag b

First occurrence of tag b -> Child tag: c

First occurrence of tag b -> Child tag text content: Child node
==18986==
==18986== HEAP SUMMARY:
==18986==     in use at exit: 281 bytes in 8 blocks
==18986==   total heap usage: 77 allocs, 69 frees, 47,465 bytes allocated
==18986==
==18986== 281 (60 direct, 221 indirect) bytes in 1 blocks are definitely lost in loss record 7 of 7
==18986==    at 0x402A17C: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==18986==    by 0x409A01E: ??? (in /usr/lib/i386-linux-gnu/libxml2.so.2.9.1)
==18986==    by 0x80487C0: xml_extract_first_occurrence_by_name (first-occurrence-test.c:30)
==18986==    by 0x8048848: xml_extract_first_occurrence_by_name (first-occurrence-test.c:36)
==18986==    by 0x80488FD: main (first-occurrence-test.c:63)
==18986==
==18986== LEAK SUMMARY:
==18986==    definitely lost: 60 bytes in 1 blocks
==18986==    indirectly lost: 221 bytes in 7 blocks
==18986==      possibly lost: 0 bytes in 0 blocks
==18986==    still reachable: 0 bytes in 0 blocks
==18986==         suppressed: 0 bytes in 0 blocks
==18986==
==18986== For counts of detected and suppressed errors, rerun with: -v
==18986== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

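I tried removing the function and doing everything in main. Same result.
Am I overlooking something? Could this be a problem with xmlCopyNode (or something I am not aware of)? Or is it a false positive?
Thanks for your help and comments.
Edit: I got it working. The key was to recognize that xmlCopyNode() already allocates the memory (so there is no need to allocate it again) and to use xmlFreeNode() to release the memory allocated by xmlCopyNode(). Thanks for your answers.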

Best answer

C passes arguments by value, not by reference.
To change where the caller's pointer points,
a called function needs the address of that pointer,
not a copy of its value.
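A minimal sketch of that point in plain C, independent of libxml2. The function names alloc_by_value and alloc_by_address are hypothetical and exist only for this illustration:

```c
#include <assert.h>
#include <stdlib.h>

/* Receives a COPY of the caller's pointer. The assignment below changes
 * only that local copy, so the caller never sees the new block. */
void alloc_by_value(int *out)
{
    out = malloc(sizeof *out);
    if (out != NULL)
        *out = 42;
    free(out);  /* released here only so that this demo does not leak */
}

/* Receives the ADDRESS of the caller's pointer, so it can redirect it.
 * Returns 1 on success, 0 if the allocation failed. */
int alloc_by_address(int **out)
{
    assert(out != NULL);
    *out = malloc(sizeof **out);
    if (*out == NULL)
        return 0;
    **out = 42;
    return 1;
}
```

A caller that starts with `int *p = NULL;` still has `p == NULL` after `alloc_by_value(p)`, whereas after a successful `alloc_by_address(&p)` the pointer refers to the new block and must later be passed to `free()`.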

This line:
*first_occurrence = *xmlCopyNode(cur_node, 1);
has a few problems.
The steps below correct them
and eliminate the memory leak.

-------------
Note that:
xmlCopyNode() returns a pointer to a copy of the original node,
i.e. a pointer that must eventually be released. Because libxml2 allocated the copy, release it with xmlFreeNode(), not free().
-------------

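Note: the parameter 'first_occurrence'
already points to a memory area allocated by the line below.
This is where the leak starts: the struct assignment copies only the top-level node into that buffer, and the pointer returned by xmlCopyNode() is then discarded, so the copy can never be freed.
xmlNode *first_occurrence = malloc(11 * sizeof(xmlNode));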

-------------
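Change this line:
*first_occurrence = *xmlCopyNode(cur_node, 1);

to:
*first_occurrence = xmlCopyNode(cur_node, 1);

For this to compile, the function's second parameter must be declared as a pointer to a pointer:
xmlNode **first_occurrence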

-------------

Then remove the malloc, so this line:
xmlNode *first_occurrence = malloc(11 * sizeof(xmlNode));

becomes:
xmlNode *first_occurrence = NULL;

and release the copy with xmlFreeNode(first_occurrence) instead of free(first_occurrence).

-------------

Finally, change the call so that it passes the address of first_occurrence
rather than its value:
xml_extract_first_occurrence_by_name(root_element, first_occurrence, "b");

becomes:
xml_extract_first_occurrence_by_name(root_element, &first_occurrence, "b");

07-27 17:27