我的查询正在超时,我想了解explain命令的输出以更好地了解问题所在。

首先,我的查询如下:

WITH f (
    SELECT
        /*+ BROADCAST(h) */
        /*+ COALESCE(36) */
        CONCAT(f.outboundlegid, '-', f.inboundlegid, '-', f.agent) AS key,
        f.querydatetime,

        f.outboundlegid,
        f.inboundlegid,
        f.agent,
        f.queryoutbounddate,
        f.queryinbounddate,
        f.price,
        f.outdeparture,
        f.outarrival,
        f.indeparture,
        f.inarrival,
        f.querydestinationplace,
        CASE WHEN type = 'HOLIDAY' AND (out_date BETWEEN start AND end)
            THEN true
            ELSE false
            END out_is_holiday,
        CASE WHEN type = 'LONG_WEEKENDS' AND (out_date BETWEEN start AND end)
            THEN true
            ELSE false
            END out_is_longweekends,
        CASE WHEN type = 'HOLIDAY' AND (in_date BETWEEN start AND end)
            THEN true
            ELSE false
            END in_is_holiday,
        CASE WHEN type = 'LONG_WEEKENDS' AND (in_date BETWEEN start AND end)
            THEN true
            ELSE false
            END in_is_longweekends
    FROM flights f
    CROSS JOIN holidays h
    LIMIT 10
)
 SELECT
    /*+ BROADCAST(a) */
    /*+ BROADCAST(p) */
    key,
    querydatetime,
    first(outboundlegid) as outboundlegid,
    first(inboundlegid) as inboundlegid,
    first(agent) as agent,
    first(p.countryName) as countryName,
    first(p.airportName) as airportName,
    first(a.name) as agentName,
    first(queryoutbounddate) as queryoutbounddate,
    first(queryinbounddate) as queryinbounddate,
    first(price) as price,
    first(outdeparture) as outdeparture,
    first(outarrival) as outarrival,
    first(indeparture) as indeparture,
    first(inarrival) as inarrival,
    first(querydestinationplace) as querydestinationplace,
    CASE WHEN array_contains(collect_set(out_is_holiday), true)
        THEN true
        ELSE false
        END out_is_holiday,
    CASE WHEN array_contains(collect_set(out_is_longweekends), true)
        THEN true
        ELSE false
        END out_is_longweekends,
    CASE WHEN array_contains(collect_set(in_is_holiday), true)
        THEN true
        ELSE false
        END in_is_holiday,
    CASE WHEN array_contains(collect_set(in_is_longweekends), true)
        THEN true
        ELSE false
        END in_is_longweekends
FROM f
INNER JOIN agents a
ON f.agent = a.id
INNER JOIN airports p
ON f.querydestinationplace = p.airportId
GROUP BY
    querydatetime,
    key


然后我的说明输出:

解析的逻辑计划

== Parsed Logical Plan ==
CTE [f]
: +- 'SubqueryAlias f
: +- 'GlobalLimit 10
: +- 'LocalLimit 10
: +- 'UnresolvedHint COALESCE, [36]
: +- 'Project ['CONCAT('f.outboundlegid, -, 'f.inboundlegid, -, 'f.agent) AS key#351, 'f.querydatetime, 'f.outboundlegid, 'f.inboundlegid, 'f.agent, 'f.queryoutbounddate, 'f.queryinbounddate, 'f.price, 'f.outdeparture, 'f.outarrival, 'f.indeparture, 'f.inarrival, 'f.querydestinationplace, CASE WHEN (('type = HOLIDAY) && (('out_date >= 'start) && ('out_date <= 'end))) THEN true ELSE false END AS out_is_holiday#352, CASE WHEN (('type = LONG_WEEKENDS) && (('out_date >= 'start) && ('out_date <= 'end))) THEN true ELSE false END AS out_is_longweekends#353, CASE WHEN (('type = HOLIDAY) && (('in_date >= 'start) && ('in_date <= 'end))) THEN true ELSE false END AS in_is_holiday#354, CASE WHEN (('type = LONG_WEEKENDS) && (('in_date >= 'start) && ('in_date <= 'end))) THEN true ELSE false END AS in_is_longweekends#355]
: +- 'Join Cross
: :- 'SubqueryAlias f
: : +- 'UnresolvedRelation `flights`
: +- 'SubqueryAlias h
: +- 'UnresolvedRelation `holidays`
+- 'GlobalLimit 10
+- 'LocalLimit 10
+- 'UnresolvedHint BROADCAST, ['a]
+- 'UnresolvedHint BROADCAST, ['p]
+- 'Aggregate ['querydatetime, 'key], ['key, 'querydatetime, first('outboundlegid, false) AS outboundlegid#320, first('inboundlegid, false) AS inboundlegid#322, first('agent, false) AS agent#324, first('p.countryName, false) AS countryName#326, first('p.airportName, false) AS airportName#328, first('a.name, false) AS agentName#330, first('queryoutbounddate, false) AS queryoutbounddate#332, first('queryinbounddate, false) AS queryinbounddate#334, first('price, false) AS price#336, first('outdeparture, false) AS outdeparture#338, first('outarrival, false) AS outarrival#340, first('indeparture, false) AS indeparture#342, first('inarrival, false) AS inarrival#344, first('querydestinationplace, false) AS querydestinationplace#346, CASE WHEN 'array_contains('collect_set('out_is_holiday), true) THEN true ELSE false END AS out_is_holiday#347, CASE WHEN 'array_contains('collect_set('out_is_longweekends), true) THEN true ELSE false END AS out_is_longweekends#348, CASE WHEN 'array_contains('collect_set('in_is_holiday), true) THEN true ELSE false END AS in_is_holiday#349, CASE WHEN 'array_contains('collect_set('in_is_longweekends), true) THEN true ELSE false END AS in_is_longweekends#350]
+- 'Join Inner, ('f.querydestinationplace = 'p.airportId)
:- 'Join Inner, ('f.agent = 'a.id)
: :- 'UnresolvedRelation `f`
: +- 'SubqueryAlias a
: +- 'UnresolvedRelation `agents`
+- 'SubqueryAlias p
+- 'UnresolvedRelation `airports`


逻辑计划分析

== Analyzed Logical Plan ==
key: string, querydatetime: date, outboundlegid: string, inboundlegid: string, agent: string, countryName: string, airportName: string, agentName: string, queryoutbounddate: string, queryinbounddate: string, price: string, outdeparture: string, outarrival: string, indeparture: string, inarrival: string, querydestinationplace: int, out_is_holiday: boolean, out_is_longweekends: boolean, in_is_holiday: boolean, in_is_longweekends: boolean
GlobalLimit 10
+- LocalLimit 10
+- Aggregate [querydatetime#207, key#351], [key#351, querydatetime#207, first(outboundlegid#184, false) AS outboundlegid#320, first(inboundlegid#185, false) AS inboundlegid#322, first(agent#181, false) AS agent#324, first(countryName#24, false) AS countryName#326, first(airportName#22, false) AS airportName#328, first(name#74, false) AS agentName#330, first(queryoutbounddate#177, false) AS queryoutbounddate#332, first(queryinbounddate#178, false) AS queryinbounddate#334, first(price#183, false) AS price#336, first(outdeparture#186, false) AS outdeparture#338, first(outarrival#187, false) AS outarrival#340, first(indeparture#196, false) AS indeparture#342, first(inarrival#197, false) AS inarrival#344, first(querydestinationplace#206, false) AS querydestinationplace#346, CASE WHEN array_contains(collect_set(out_is_holiday#352, 0, 0), true) THEN true ELSE false END AS out_is_holiday#347, CASE WHEN array_contains(collect_set(out_is_longweekends#353, 0, 0), true) THEN true ELSE false END AS out_is_longweekends#348, CASE WHEN array_contains(collect_set(in_is_holiday#354, 0, 0), true) THEN true ELSE false END AS in_is_holiday#349, CASE WHEN array_contains(collect_set(in_is_longweekends#355, 0, 0), true) THEN true ELSE false END AS in_is_longweekends#350]
+- Join Inner, (querydestinationplace#206 = cast(airportId#38 as int))
:- Join Inner, (agent#181 = id#83)
: :- SubqueryAlias f
: : +- GlobalLimit 10
: : +- LocalLimit 10
: : +- Project [concat(outboundlegid#184, -, inboundlegid#185, -, agent#181) AS key#351, querydatetime#207, outboundlegid#184, inboundlegid#185, agent#181, queryoutbounddate#177, queryinbounddate#178, price#183, outdeparture#186, outarrival#187, indeparture#196, inarrival#197, querydestinationplace#206, CASE WHEN ((type#57 = HOLIDAY) && ((out_date#243 >= start#55) && (out_date#243 <= end#56))) THEN true ELSE false END AS out_is_holiday#352, CASE WHEN ((type#57 = LONG_WEEKENDS) && ((out_date#243 >= start#55) && (out_date#243 <= end#56))) THEN true ELSE false END AS out_is_longweekends#353, CASE WHEN ((type#57 = HOLIDAY) && ((in_date#280 >= start#55) && (in_date#280 <= end#56))) THEN true ELSE false END AS in_is_holiday#354, CASE WHEN ((type#57 = LONG_WEEKENDS) && ((in_date#280 >= start#55) && (in_date#280 <= end#56))) THEN true ELSE false END AS in_is_longweekends#355]
: : +- Join Cross
: : :- SubqueryAlias f
: : : +- SubqueryAlias flights
: : : +- Project [Id#174, QueryTaskId#175, QueryOriginPlace#176, QueryOutboundDate#177, QueryInboundDate#178, QueryCabinClass#179, QueryCurrency#180, Agent#181, QuoteAgeInMinutes#182, Price#183, OutboundLegId#184, InboundLegId#185, OutDeparture#186, OutArrival#187, OutDuration#188, OutJourneyMode#189, OutStops#190, OutCarriers#191, OutOperatingCarriers#192, NumberOutStops#193, NumberOutCarriers#194, NumberOutOperatingCarriers#195, InDeparture#196, InArrival#197, ... 12 more fields]
: : : +- Project [Id#174, QueryTaskId#175, QueryOriginPlace#176, QueryOutboundDate#177, QueryInboundDate#178, QueryCabinClass#179, QueryCurrency#180, Agent#181, QuoteAgeInMinutes#182, Price#183, OutboundLegId#184, InboundLegId#185, OutDeparture#186, OutArrival#187, OutDuration#188, OutJourneyMode#189, OutStops#190, OutCarriers#191, OutOperatingCarriers#192, NumberOutStops#193, NumberOutCarriers#194, NumberOutOperatingCarriers#195, InDeparture#196, InArrival#197, ... 11 more fields]
: : : +- LogicalRDD [Id#174, QueryTaskId#175, QueryOriginPlace#176, QueryOutboundDate#177, QueryInboundDate#178, QueryCabinClass#179, QueryCurrency#180, Agent#181, QuoteAgeInMinutes#182, Price#183, OutboundLegId#184, InboundLegId#185, OutDeparture#186, OutArrival#187, OutDuration#188, OutJourneyMode#189, OutStops#190, OutCarriers#191, OutOperatingCarriers#192, NumberOutStops#193, NumberOutCarriers#194, NumberOutOperatingCarriers#195, InDeparture#196, InArrival#197, ... 10 more fields]
: : +- SubqueryAlias h
: : +- SubqueryAlias holidays
: : +- LogicalRDD [start#55, end#56, type#57]
: +- ResolvedHint isBroadcastable=true
: +- SubqueryAlias a
: +- SubqueryAlias agents
: +- Project [cast(id#73L as string) AS id#83, name#74]
: +- Project [id#73L, name#74]
: +- LogicalRDD [id#73L, name#74, type#75]
+- ResolvedHint isBroadcastable=true
+- SubqueryAlias p
+- SubqueryAlias airports
+- Project [cast(airportId#18L as string) AS airportId#38, countryName#24, cityName#23, airportName#22]
+- Project [airportId#18L, countryName#24, cityName#23, airportName#22]
+- LogicalRDD [airportId#18L, cityId#19L, countryId#20L, airportCode#21, airportName#22, cityName#23, countryName#24]

== Optimized Logical Plan ==
GlobalLimit 10
+- LocalLimit 10
+- Aggregate [querydatetime#207, key#351], [key#351, querydatetime#207, first(outboundlegid#184, false) AS outboundlegid#320, first(inboundlegid#185, false) AS inboundlegid#322, first(agent#181, false) AS agent#324, first(countryName#24, false) AS countryName#326, first(airportName#22, false) AS airportName#328, first(name#74, false) AS agentName#330, first(queryoutbounddate#177, false) AS queryoutbounddate#332, first(queryinbounddate#178, false) AS queryinbounddate#334, first(price#183, false) AS price#336, first(outdeparture#186, false) AS outdeparture#338, first(outarrival#187, false) AS outarrival#340, first(indeparture#196, false) AS indeparture#342, first(inarrival#197, false) AS inarrival#344, first(querydestinationplace#206, false) AS querydestinationplace#346, CASE WHEN array_contains(collect_set(out_is_holiday#352, 0, 0), true) THEN true ELSE false END AS out_is_holiday#347, CASE WHEN array_contains(collect_set(out_is_longweekends#353, 0, 0), true) THEN true ELSE false END AS out_is_longweekends#348, CASE WHEN array_contains(collect_set(in_is_holiday#354, 0, 0), true) THEN true ELSE false END AS in_is_holiday#349, CASE WHEN array_contains(collect_set(in_is_longweekends#355, 0, 0), true) THEN true ELSE false END AS in_is_longweekends#350]
+- Project [key#351, querydatetime#207, outboundlegid#184, inboundlegid#185, agent#181, queryoutbounddate#177, queryinbounddate#178, price#183, outdeparture#186, outarrival#187, indeparture#196, inarrival#197, querydestinationplace#206, out_is_holiday#352, out_is_longweekends#353, in_is_holiday#354, in_is_longweekends#355, name#74, countryName#24, airportName#22]
+- Join Inner, (querydestinationplace#206 = cast(airportId#38 as int))
:- Project [key#351, querydatetime#207, outboundlegid#184, inboundlegid#185, agent#181, queryoutbounddate#177, queryinbounddate#178, price#183, outdeparture#186, outarrival#187, indeparture#196, inarrival#197, querydestinationplace#206, out_is_holiday#352, out_is_longweekends#353, in_is_holiday#354, in_is_longweekends#355, name#74]
: +- Join Inner, (agent#181 = id#83)
: :- Filter (isnotnull(agent#181) && isnotnull(querydestinationplace#206))
: : +- GlobalLimit 10
: : +- LocalLimit 10
: : +- Project [concat(outboundlegid#184, -, inboundlegid#185, -, agent#181) AS key#351, querydatetime#207, outboundlegid#184, inboundlegid#185, agent#181, queryoutbounddate#177, queryinbounddate#178, price#183, outdeparture#186, outarrival#187, indeparture#196, inarrival#197, querydestinationplace#206, CASE WHEN ((type#57 = HOLIDAY) && ((out_date#243 >= start#55) && (out_date#243 <= end#56))) THEN true ELSE false END AS out_is_holiday#352, CASE WHEN ((type#57 = LONG_WEEKENDS) && ((out_date#243 >= start#55) && (out_date#243 <= end#56))) THEN true ELSE false END AS out_is_longweekends#353, CASE WHEN ((type#57 = HOLIDAY) && ((in_date#280 >= start#55) && (in_date#280 <= end#56))) THEN true ELSE false END AS in_is_holiday#354, CASE WHEN ((type#57 = LONG_WEEKENDS) && ((in_date#280 >= start#55) && (in_date#280 <= end#56))) THEN true ELSE false END AS in_is_longweekends#355]
: : +- Join Cross
: : :- Project [QueryOutboundDate#177, QueryInboundDate#178, Agent#181, Price#183, OutboundLegId#184, InboundLegId#185, OutDeparture#186, OutArrival#187, InDeparture#196, InArrival#197, querydestinationplace#206, querydatetime#207, to_date(cast(outdeparture#186 as date)) AS out_date#243, to_date(cast(indeparture#196 as date)) AS in_date#280]
: : : +- LogicalRDD [Id#174, QueryTaskId#175, QueryOriginPlace#176, QueryOutboundDate#177, QueryInboundDate#178, QueryCabinClass#179, QueryCurrency#180, Agent#181, QuoteAgeInMinutes#182, Price#183, OutboundLegId#184, InboundLegId#185, OutDeparture#186, OutArrival#187, OutDuration#188, OutJourneyMode#189, OutStops#190, OutCarriers#191, OutOperatingCarriers#192, NumberOutStops#193, NumberOutCarriers#194, NumberOutOperatingCarriers#195, InDeparture#196, InArrival#197, ... 10 more fields]
: : +- LogicalRDD [start#55, end#56, type#57]
: +- ResolvedHint isBroadcastable=true
: +- Project [cast(id#73L as string) AS id#83, name#74]
: +- Filter (isnotnull(id#73L) && isnotnull(cast(id#73L as string)))
: +- LogicalRDD [id#73L, name#74, type#75]
+- ResolvedHint isBroadcastable=true
+- Project [cast(airportId#18L as string) AS airportId#38, countryName#24, airportName#22]
+- Filter (isnotnull(airportId#18L) && isnotnull(cast(airportId#18L as string)))
+- LogicalRDD [airportId#18L, cityId#19L, countryId#20L, airportCode#21, airportName#22, cityName#23, countryName#24]


身体计划

== Physical Plan ==
CollectLimit 10
+- ObjectHashAggregate(keys=[querydatetime#207, key#351], functions=[first(outboundlegid#184, false), first(inboundlegid#185, false), first(agent#181, false), first(countryName#24, false), first(airportName#22, false), first(name#74, false), first(queryoutbounddate#177, false), first(queryinbounddate#178, false), first(price#183, false), first(outdeparture#186, false), first(outarrival#187, false), first(indeparture#196, false), first(inarrival#197, false), first(querydestinationplace#206, false), collect_set(out_is_holiday#352, 0, 0), collect_set(out_is_longweekends#353, 0, 0), collect_set(in_is_holiday#354, 0, 0), collect_set(in_is_longweekends#355, 0, 0)], output=[key#351, querydatetime#207, outboundlegid#320, inboundlegid#322, agent#324, countryName#326, airportName#328, agentName#330, queryoutbounddate#332, queryinbounddate#334, price#336, outdeparture#338, outarrival#340, indeparture#342, inarrival#344, querydestinationplace#346, out_is_holiday#347, out_is_longweekends#348, in_is_holiday#349, in_is_longweekends#350])
+- ObjectHashAggregate(keys=[querydatetime#207, key#351], functions=[partial_first(outboundlegid#184, false), partial_first(inboundlegid#185, false), partial_first(agent#181, false), partial_first(countryName#24, false), partial_first(airportName#22, false), partial_first(name#74, false), partial_first(queryoutbounddate#177, false), partial_first(queryinbounddate#178, false), partial_first(price#183, false), partial_first(outdeparture#186, false), partial_first(outarrival#187, false), partial_first(indeparture#196, false), partial_first(inarrival#197, false), partial_first(querydestinationplace#206, false), partial_collect_set(out_is_holiday#352, 0, 0), partial_collect_set(out_is_longweekends#353, 0, 0), partial_collect_set(in_is_holiday#354, 0, 0), partial_collect_set(in_is_longweekends#355, 0, 0)], output=[querydatetime#207, key#351, first#413, valueSet#414, first#415, valueSet#416, first#417, valueSet#418, first#419, valueSet#420, first#421, valueSet#422, first#423, valueSet#424, first#425, valueSet#426, first#427, valueSet#428, first#429, valueSet#430, first#431, valueSet#432, first#433, valueSet#434, ... 10 more fields])
+- *Project [key#351, querydatetime#207, outboundlegid#184, inboundlegid#185, agent#181, queryoutbounddate#177, queryinbounddate#178, price#183, outdeparture#186, outarrival#187, indeparture#196, inarrival#197, querydestinationplace#206, out_is_holiday#352, out_is_longweekends#353, in_is_holiday#354, in_is_longweekends#355, name#74, countryName#24, airportName#22]
+- *BroadcastHashJoin [querydestinationplace#206], [cast(airportId#38 as int)], Inner, BuildRight
:- *Project [key#351, querydatetime#207, outboundlegid#184, inboundlegid#185, agent#181, queryoutbounddate#177, queryinbounddate#178, price#183, outdeparture#186, outarrival#187, indeparture#196, inarrival#197, querydestinationplace#206, out_is_holiday#352, out_is_longweekends#353, in_is_holiday#354, in_is_longweekends#355, name#74]
: +- *BroadcastHashJoin [agent#181], [id#83], Inner, BuildRight
: :- *Filter (isnotnull(agent#181) && isnotnull(querydestinationplace#206))
: : +- *GlobalLimit 10
: : +- Exchange SinglePartition
: : +- *LocalLimit 10
: : +- *Project [concat(outboundlegid#184, -, inboundlegid#185, -, agent#181) AS key#351, querydatetime#207, outboundlegid#184, inboundlegid#185, agent#181, queryoutbounddate#177, queryinbounddate#178, price#183, outdeparture#186, outarrival#187, indeparture#196, inarrival#197, querydestinationplace#206, CASE WHEN ((type#57 = HOLIDAY) && ((out_date#243 >= start#55) && (out_date#243 <= end#56))) THEN true ELSE false END AS out_is_holiday#352, CASE WHEN ((type#57 = LONG_WEEKENDS) && ((out_date#243 >= start#55) && (out_date#243 <= end#56))) THEN true ELSE false END AS out_is_longweekends#353, CASE WHEN ((type#57 = HOLIDAY) && ((in_date#280 >= start#55) && (in_date#280 <= end#56))) THEN true ELSE false END AS in_is_holiday#354, CASE WHEN ((type#57 = LONG_WEEKENDS) && ((in_date#280 >= start#55) && (in_date#280 <= end#56))) THEN true ELSE false END AS in_is_longweekends#355]
: : +- CartesianProduct
: : :- *Project [QueryOutboundDate#177, QueryInboundDate#178, Agent#181, Price#183, OutboundLegId#184, InboundLegId#185, OutDeparture#186, OutArrival#187, InDeparture#196, InArrival#197, querydestinationplace#206, querydatetime#207, to_date(cast(outdeparture#186 as date)) AS out_date#243, to_date(cast(indeparture#196 as date)) AS in_date#280]
: : : +- Scan ExistingRDD[Id#174,QueryTaskId#175,QueryOriginPlace#176,QueryOutboundDate#177,QueryInboundDate#178,QueryCabinClass#179,QueryCurrency#180,Agent#181,QuoteAgeInMinutes#182,Price#183,OutboundLegId#184,InboundLegId#185,OutDeparture#186,OutArrival#187,OutDuration#188,OutJourneyMode#189,OutStops#190,OutCarriers#191,OutOperatingCarriers#192,NumberOutStops#193,NumberOutCarriers#194,NumberOutOperatingCarriers#195,InDeparture#196,InArrival#197,... 10 more fields]
: : +- Scan ExistingRDD[start#55,end#56,type#57]
: +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, string, true]))
: +- *Project [cast(id#73L as string) AS id#83, name#74]
: +- *Filter (isnotnull(id#73L) && isnotnull(cast(id#73L as string)))
: +- Scan ExistingRDD[id#73L,name#74,type#75]
+- BroadcastExchange HashedRelationBroadcastMode(List(cast(cast(input[0, string, true] as int) as bigint)))
+- *Project [cast(airportId#18L as string) AS airportId#38, countryName#24, airportName#22]
+- *Filter (isnotnull(airportId#18L) && isnotnull(cast(airportId#18L as string)))
+- Scan ExistingRDD[airportId#18L,cityId#19L,countryId#20L,airportCode#21,airportName#22,cityName#23,countryName#24]


我可以了解每种计划的目的吗?像有什么区别?我什么时候应该看哪个?

某些步骤意味着什么?例如。项目,扫描,BroadcastExchange,本地限制与全局限制。我应该注意哪些常见的事情?例如。在MySQL中解释,全表扫描可能表明我应该使用某种索引。

我应该如何读取输出?是自上而下的吗?

最佳答案

逻辑计划是代表模式和数据的树。这些树通过催化剂框架进行操作和优化。

逻辑计划分为三种:
1解析的逻辑计划。
2分析的逻辑计划
3优化逻辑计划。

经过分析的逻辑计划要通过一系列规则来解决。然后,产生优化的逻辑计划。优化的逻辑计划通常允许Spark插入一组优化规则。您可以为优化的逻辑计划插入自己的规则。
此优化的逻辑计划将转换为``物理计划''以进一步执行。这些计划位于DataFrame API中。

例如,如果您正在使用两个过滤器操作,则另一个逻辑计划之后将进行两次过滤器转换,但在为最终操作创建实际物理计划之前,spark将执行一些优化,但是两个过滤器之间的和条件将创建一个过滤器转换。

有关更多详细信息,请阅读火花催化剂框架

就您的问题而言,只需检查物理计划以进行任何调试,因为这是实际计划。 Spark执行。

关于optimization - 需要帮助了解PySpark解释输出,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/54614135/

10-12 23:19