This article explains how to turn off scientific notation in PySpark; the approach below should be a useful reference for anyone facing the same problem.
Problem description
As the result of some aggregation, I come up with the following Spark DataFrame:
+------------+-----------------+-----------------+
|sale_user_id|gross_profit |total_sale_volume|
+------------+-----------------+-----------------+
| 20569| -3322960.0| 2.12569482E8|
| 24269| -1876253.0| 8.6424626E7|
| 9583| 0.0| 1.282272E7|
| 11722| 18229.0| 5653149.0|
| 37982| 6077.0| 1181243.0|
| 20428| 1665.0| 7011588.0|
| 41157| 73227.0| 1.18631E7|
| 9993| 0.0| 1481437.0|
| 9030| 8865.0| 4.4133791E7|
| 829| 0.0| 11355.0|
+------------+-----------------+-----------------+
and the schema of the DataFrame is:
root
|-- sale_user_id: string (nullable = true)
|-- tapp_gross_profit: double (nullable = true)
|-- total_sale_volume: double (nullable = true)
How can I disable scientific notation in each of the gross_profit and total_sale_volume columns?
Accepted answer
The easiest way is to cast the double column to a decimal type, giving an appropriate precision and scale:
from pyspark.sql.types import DecimalType

df = df.withColumn('total_sale_volume', df.total_sale_volume.cast(DecimalType(18, 2)))
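Applied to both columns, a minimal sketch might look like the following. The column names follow the question (note the schema above actually shows tapp_gross_profit, so adjust the name as needed), and DecimalType(18, 2) is just one reasonable precision/scale choice, not the only one:

from pyspark.sql.functions import col
from pyspark.sql.types import DecimalType

# Cast both double columns to fixed-point decimals; show() then prints
# plain numbers such as 212569482.00 instead of 2.12569482E8.
df = (df
      .withColumn('gross_profit', col('gross_profit').cast(DecimalType(18, 2)))
      .withColumn('total_sale_volume', col('total_sale_volume').cast(DecimalType(18, 2))))
df.show()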