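I am using libxml2 in C on Linux to extract information from an XML file. I wrote a function that finds the first occurrence of a given tag and returns a copy of that tag.
For example, given the following XML text: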
<a><b>First occurrence of tag b<c>Child node</c></b><b>Second occurrence of tag b</b></a>
I want to extract the first b tag, including its children if it has any.
Below is a boiled-down version of the code I am using:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libxml/parser.h>
#include <libxml/tree.h>
#ifdef LIBXML_TREE_ENABLED
static int
xml_extract_first_occurrence_by_name(xmlNode * start_node, xmlNode * first_occurrence, xmlChar * node_name)
{
xmlNode *cur_node = NULL;
for (cur_node = start_node; cur_node; cur_node = cur_node->next) {
if ( !xmlStrcmp(cur_node->name, node_name) ) {
// use libxml2 function xmlCopyNode to recursively copy cur_node (i.e. copy node + children) to first_occurrence
// (since cur_node will not be valid outside this function)
*first_occurrence = *xmlCopyNode(cur_node, 1);
return 1;
}
// if not found in parent search recursively in children
if (cur_node->children)
if ( xml_extract_first_occurrence_by_name(cur_node->children, first_occurrence, node_name) )
return 1;
}
// no match in this subtree
return 0;
}
int
main(int argc, char **argv)
{
xmlDoc *doc = NULL;
xmlNode *root_element = NULL;
if (argc != 2)
return(1);
// initialize libxml2 and check for version mismatches
LIBXML_TEST_VERSION
/*parse the file and get the DOM */
doc = xmlReadFile(argv[1], NULL, XML_PARSE_NOBLANKS);
/*Get the root element node */
root_element = xmlDocGetRootElement(doc);
// allocate memory for the node to be extracted (including up to 10 children)
xmlNode *first_occurrence = malloc(11 * sizeof(xmlNode));
// search in tag root_element the first occurrence of tag "b"
xml_extract_first_occurrence_by_name(root_element, first_occurrence, "b");
printf("\nFirst occurrence of tag b -> Text content: %s\n", first_occurrence->children->content);
printf("\nFirst occurrence of tag b -> Child tag: %s\n", first_occurrence->children->next->name);
printf("\nFirst occurrence of tag b -> Child tag text content: %s\n", first_occurrence->children->next->children->content);
free(first_occurrence);
/*free the document */
xmlFreeDoc(doc);
/*
*Free the global variables that may
*have been allocated by the parser.
*/
xmlCleanupParser();
return 0;
}
#else
int main(void) {
fprintf(stderr, "Tree support not compiled in\n");
exit(1);
}
#endif
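The program runs as expected and compiles without warnings. However, when I run it under valgrind (with leak checking enabled), it reports a memory leak: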
valgrind --leak-check=full ./first-occurrence-test.bin first-occurrence-test.xml
==18986== Memcheck, a memory error detector
==18986== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==18986== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==18986== Command: ./first-occurrence-test.bin first-occurrence-test.xml
==18986==
First occurrence of tag b -> Text content: First occurrence of tag b
First occurrence of tag b -> Child tag: c
First occurrence of tag b -> Child tag text content: Child node
==18986==
==18986== HEAP SUMMARY:
==18986== in use at exit: 281 bytes in 8 blocks
==18986== total heap usage: 77 allocs, 69 frees, 47,465 bytes allocated
==18986==
==18986== 281 (60 direct, 221 indirect) bytes in 1 blocks are definitely lost in loss record 7 of 7
==18986== at 0x402A17C: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==18986== by 0x409A01E: ??? (in /usr/lib/i386-linux-gnu/libxml2.so.2.9.1)
==18986== by 0x80487C0: xml_extract_first_occurrence_by_name (first-occurrence-test.c:30)
==18986== by 0x8048848: xml_extract_first_occurrence_by_name (first-occurrence-test.c:36)
==18986== by 0x80488FD: main (first-occurrence-test.c:63)
==18986==
==18986== LEAK SUMMARY:
==18986== definitely lost: 60 bytes in 1 blocks
==18986== indirectly lost: 221 bytes in 7 blocks
==18986== possibly lost: 0 bytes in 0 blocks
==18986== still reachable: 0 bytes in 0 blocks
==18986== suppressed: 0 bytes in 0 blocks
==18986==
==18986== For counts of detected and suppressed errors, rerun with: -v
==18986== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
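I also tried getting rid of the function and doing everything in main. Same result.
Am I overlooking something? Could this be a problem with xmlCopyNode() (or something else I am not aware of)? Or is it a false positive?
Thanks for your help and comments.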
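Edit: I got it working. The key was to recognize that xmlCopyNode() already allocates the memory (there is no need to allocate it again), and to use
xmlFreeNode()
to free the memory allocated by xmlCopyNode(). Thank you for your answers.
Best answer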
C passes arguments by value, not by reference.
To change where the caller's pointer points,
a function needs the address of that pointer (here an xmlNode **),
not a copy of the pointer itself.
this line:
*first_occurrence = *xmlCopyNode(cur_node, 1);
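has a few problems:
it dereferences the pointer returned by xmlCopyNode() and copies only the top-level xmlNode struct into the caller's buffer,
and it then throws that pointer away, so the copied node and its children can never be freed.
Below is what needs to be done to correct the problems
and eliminate the memory leak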
-------------
note that:
xmlCopyNode() returns a pointer to a newly allocated copy of the original node,
i.e. it returns a pointer that must eventually be released, in this case with xmlFreeNode()
-------------
Note: the passed parameter 'first_occurrence'
already points to a separately allocated memory area
(not needed, since xmlCopyNode() allocates the copy itself) via:
xmlNode *first_occurrence = malloc(11 * sizeof(xmlNode));
-------------
suggest declaring the parameter as xmlNode **first_occurrence and changing this line:
*first_occurrence = *xmlCopyNode(cur_node, 1);
to:
*first_occurrence = xmlCopyNode(cur_node, 1);
-------------
then remove the malloc so this line:
xmlNode *first_occurrence = malloc(11 * sizeof(xmlNode));
becomes:
xmlNode *first_occurrence = NULL;
-------------
finally, change this call so the argument is the address of
first_occurrence rather than its value:
xml_extract_first_occurrence_by_name(root_element, first_occurrence, "b");
to this:
xml_extract_first_occurrence_by_name(root_element, &first_occurrence, "b");
-------------
and, since the copy now comes from xmlCopyNode(), release it with
xmlFreeNode(first_occurrence) instead of free(first_occurrence)