问题描述
我正在为某些人从DBpedia中提取标签.我现在已经部分成功,但是我陷入了以下问题.以下代码有效.
I am trying to extract labels from DBpedia for some persons. I am partially successful now, but I got stuck in the following problem. The following code works.
public class DbPediaQueryExtractor {
public static void main(String [] args) {
String entity = "Aharon_Barak";
String queryString ="PREFIX dbres: <http://dbpedia.org/resource/> SELECT * WHERE {dbres:"+ entity+ "<http://www.w3.org/2000/01/rdf-schema#label> ?o FILTER (langMatches(lang(?o),\"en\"))}";
//String queryString="select * where { ?instance <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person>; <http://www.w3.org/2000/01/rdf-schema#label> ?o FILTER (langMatches(lang(?o),\"en\")) } LIMIT 5000000";
QueryExecution qexec = getResult(queryString);
try {
ResultSet results = qexec.execSelect();
for ( ; results.hasNext(); )
{
QuerySolution soln = results.nextSolution();
System.out.print(soln.get("?o") + "\n");
}
}
finally {
qexec.close();
}
}
public static QueryExecution getResult(String queryString){
Query query = QueryFactory.create(queryString);
//VirtuosoQueryExecution vqe = VirtuosoQueryExecutionFactory.create (sparql, graph);
QueryExecution qexec = QueryExecutionFactory.sparqlService("http://dbpedia.org/sparql", query);
return qexec;
}
}
但是,当实体包含方括号时,它将不起作用.例如,
However, when the entity contains brackets, it does not work. For example,
String entity = "William_H._Miller_(writer)";
导致此异常:
出什么问题了?
推荐答案
进行了一些复制和粘贴,以查看发生了什么事情.我建议您在查询中添加换行符,以提高可读性.您正在使用的查询是:
It took some copying and pasting to see what exactly was going on. I'd suggest that you put newlines in your query for easier readability. The query you're using is:
PREFIX dbres: <http://dbpedia.org/resource/>
SELECT * WHERE
{
dbres:??? <http://www.w3.org/2000/01/rdf-schema#label> ?o
FILTER (langMatches(lang(?o),"en"))
}
,其中???
被字符串entity
的内容替换.您在此处完全没有进行任何输入验证,以确保粘贴entity
的值是合法的.根据您的问题,听起来entity
包含William_H._Miller_(writer)
,所以您得到了查询:
where ???
is being replaced by the contents of the string entity
. You're doing absolutely no input validation here to ensure that the value of entity
will be legal to paste in. Based on your question, it sounds like entity
contains William_H._Miller_(writer)
, so you're getting the query:
PREFIX dbres: <http://dbpedia.org/resource/>
SELECT * WHERE
{
dbres:William_H._Miller_(writer) <http://www.w3.org/2000/01/rdf-schema#label> ?o
FILTER (langMatches(lang(?o),"en"))
}
您可以将其粘贴到公共DBpedia端点,您将收到类似的解析错误消息:
You can paste that into the public DBpedia endpoint, and you'll get a similar parse error message:
Virtuoso 37000 Error SP030: SPARQL compiler, line 6: syntax error at 'writer' before ')'
SPARQL query:
define sql:big-data-const 0
#output-format:text/html
define sql:signal-void-variables 1 define input:default-graph-uri <http://dbpedia.org> PREFIX dbres: <http://dbpedia.org/resource/>
SELECT * WHERE
{
dbres:William_H._Miller_(writer) <http://www.w3.org/2000/01/rdf-schema#label> ?o
FILTER (langMatches(lang(?o),"en"))
}
除了用不好的查询打到DBpedia的端点之外,您还可以使用 SPARQL查询验证器,哪个查询报告:
Better than hitting DBpedia's endpoint with bad queries, you can also use the SPARQL query validator, which reports for that query:
在Jena中,可以使用ParameterizedSparqlString避免此类问题.这是您的示例,经过重新设计以使用参数化的字符串:
In Jena, you can use the ParameterizedSparqlString to avoid these sorts of issues. Here's your example, reworked to use a parameterized string:
import com.hp.hpl.jena.query.ParameterizedSparqlString;
public class PSSExample {
public static void main( String[] args ) {
// Create a parameterized SPARQL string for the particular query, and add the
// dbres prefix to it, for later use.
final ParameterizedSparqlString queryString = new ParameterizedSparqlString(
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n" +
"SELECT * WHERE\n" +
"{\n" +
" ?entity rdfs:label ?o\n" +
" FILTER (langMatches(lang(?o),\"en\"))\n" +
"}\n"
) {{
setNsPrefix( "dbres", "http://dbpedia.org/resource/" );
}};
// Entity is the same.
final String entity = "William_H._Miller_(writer)";
// Now retrieve the URI for dbres, concatentate it with entity, and use
// it as the value of ?entity in the query.
queryString.setIri( "?entity", queryString.getNsPrefixURI( "dbres" )+entity );
// Show the query.
System.out.println( queryString.toString() );
}
}
输出为:
PREFIX dbres: <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE
{
<http://dbpedia.org/resource/William_H._Miller_(writer)> rdfs:label ?o
FILTER (langMatches(lang(?o),"en"))
}
您可以在公共端点上运行此查询,并获取预期的结果.请注意,如果您使用的entity
不需要特殊的转义,例如
You can run this query at the public endpoint and get the expected results. Notice that if you use an entity
that doesn't need special escaping, e.g.,
final String entity = "George_Washington";
然后查询输出将使用前缀形式:
then the query output will use the prefixed form:
PREFIX dbres: <http://dbpedia.org/resource/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT * WHERE
{
dbres:George_Washington rdfs:label ?o
FILTER (langMatches(lang(?o),"en"))
}
这非常方便,因为您无需执行任何操作即可检查后缀entity
是否具有需要转义的任何字符;耶拿(Jena)为您照顾.
This is very convenient, because you don't have to do any checking about whether your suffix, i.e., entity
, has any characters that need to be escaped; Jena takes care of that for you.
这篇关于带圆括号的sparql查询抛出异常的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!