I am trying to use Guava's Splitter to parse a log file. A line of the log file looks like this:

appName=XXX clientIp=X.X.X timestamp="2017-06-05T13:22:12-07:00" request="POST /forward HTTP/1.1" statusCode=204 bytesOut=1167 totalTime=0.062 bytesIn=1289 sourceHost=XXXX connId=49936598 connReqs=9 upInstance=XXX:104:XXX-XXX:8664:17F34 upConnectSec=0.052 upAddr="XX.XX.XX:123" upHost="vcv08it-cvcv2801:8464" upHdrTimeSec=0.058 upRespTimeSec=0.058 pid=32561  upStatusCode=204 message="Access Log" corrKey=GMIFCDIKRZR2T4VZQXJA2IT6 upCached=- length=0 partition=XXX location="= /v1/tXXXX" xff="XX.XX.XX.XX" referer="-" user-agent="Apache-HttpAsyncClient/4.1.1 (Java/1.8.0_131)\" rateLimitCurrentValues="--" rateLimitTimeMs=\"-:-"


I use the following code to parse it:

Map<String, String> parserMap;
parserMap = Splitter.onPattern("\\s(?=([^\\\"]*\\\"[^\\\"]*\\\")*[^\\\"]*$)")
.omitEmptyStrings()
.withKeyValueSeparator(Splitter.onPattern("="))
.split(line);


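My problem is the location="= /v1/tXXXX" field: its value contains an '=' inside the quoted string, and the current withKeyValueSeparator cannot parse it. Could you help me change the pattern so that all fields are parsed correctly?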

Best answer

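Your code throws java.lang.IllegalArgumentException: Chunk [location="= /v1/tXXXX"] is not a valid entry, because the keyValueSeparator occurs more than once in that chunk. You can adjust the keyValueSeparator so that it only matches an equals sign that is followed by your value pattern. For example: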

final String keyPattern = "\\S+";
final String valuePattern = "(\\S+|\"[^\"]*\")";
parserMap = Splitter.onPattern("\\s(?=" + keyPattern + "=" + valuePattern + ")")
        .omitEmptyStrings()
        .withKeyValueSeparator(Splitter.onPattern("=(?=" + valuePattern + ")"))
        .split(line);
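
To see why the lookahead makes a difference, the following sketch counts how often each separator pattern matches inside the problematic chunk. It uses only java.util.regex, not Guava, and the names SeparatorMatchDemo and countMatches are made up for illustration. A key-value separator that matches exactly once per chunk is what the map splitter needs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SeparatorMatchDemo {

    // Same value pattern as in the answer: an unquoted token or a quoted string.
    static final String VALUE_PATTERN = "(\\S+|\"[^\"]*\")";

    // Counts how many times the regex matches in the input (illustrative helper).
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String chunk = "location=\"= /v1/tXXXX\"";
        // A plain "=" matches twice, which is why the chunk is rejected.
        System.out.println(countMatches("=", chunk)); // prints 2
        // With the lookahead, the second '=' is followed by a space and no longer matches.
        System.out.println(countMatches("=(?=" + VALUE_PATTERN + ")", chunk)); // prints 1
    }
}
```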


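Note that this approach will not work if your line contains something like key="key=value", because the second '=' is then also followed by text that matches the value pattern.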
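
For lines where a quoted value can contain key=value text, one possible alternative is to stop splitting and instead scan the line for whole key=value tokens. The sketch below is a different technique from the Splitter-based answer: it uses java.util.regex directly, and LogLineParser and parse are hypothetical names. It assumes that keys contain neither whitespace nor '=', and that a quoted value never contains a double quote.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LogLineParser {

    // A key, then '=', then either a double-quoted string or a run of non-whitespace.
    private static final Pattern ENTRY =
            Pattern.compile("([^\\s=]+)=(\"[^\"]*\"|\\S*)");

    // Returns the fields in the order they appear; surrounding quotes are kept in the values.
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = ENTRY.matcher(line);
        while (m.find()) {
            // Unlike Guava's map splitter, a repeated key silently overwrites the earlier value.
            fields.put(m.group(1), m.group(2));
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "statusCode=204 location=\"= /v1/tXXXX\" key=\"key=value\" upCached=-";
        parse(line).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

Because the quoted alternative is tried first, everything between a pair of double quotes is consumed as one value, so an '=' inside the quotes is never treated as a separator.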

10-04 17:37