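First of all, I am not a Python developer. But I need synthetic data, and I am trying to use the Synthetic Data Vault (https://github.com/sdv-dev/SDV).
I have installed Python 3.7 (on Windows; for now I am doing this on my laptop while I learn how it works).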
python --version
Python 3.7.6
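I was able to install the sdv package with pip, and I can run the first few lines of the demo code to load and view the metadata and the demo tables. However, when I get to these lines of the demo: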
sdv = SDV()
sdv.fit(metadata, tables)
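I get the following error:
TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]
I have not modified any of the code from git, and I have not tried any code of my own. I am literally just following the demo as described in the README: I installed the package and am working through the first example. Has anyone tried this and hit the same problem? Any ideas on how to resolve this error?
The full stack trace is: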
sdv.fit(metadata, tables)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\tools\Python\3.7\lib\site-packages\sdv\sdv.py", line 69, in fit
self.modeler.model_database(tables)
File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 128, in model_database
self.cpa(table_name, tables)
File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 99, in cpa
child_table = self.cpa(child_name, tables, child_key)
File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 99, in cpa
child_table = self.cpa(child_name, tables, child_key)
File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 92, in cpa
extended = self.metadata.transform(table_name, table)
File "C:\tools\Python\3.7\lib\site-packages\sdv\metadata.py", line 477, in transform
hyper_transformer.fit(data[fields])
File "C:\tools\Python\3.7\lib\site-packages\rdt\hyper_transformer.py", line 128, in fit
transformer.fit(column)
File "C:\tools\Python\3.7\lib\site-packages\rdt\transformers\datetime.py", line 55, in fit
transformed = self._transform(data)
File "C:\tools\Python\3.7\lib\site-packages\rdt\transformers\datetime.py", line 40, in _transform
integers = datetimes.astype(int).astype(float).values
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\generic.py", line 5691, in astype
**kwargs)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\managers.py", line 531, in astype
return self.apply('astype', dtype=dtype, **kwargs)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\managers.py", line 395, in apply
applied = getattr(b, f)(**kwargs)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\blocks.py", line 534, in astype
**kwargs)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\blocks.py", line 2139, in _astype
return super(DatetimeBlock, self)._astype(dtype=dtype, **kwargs)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\blocks.py", line 633, in _astype
values = astype_nansafe(values.ravel(), dtype, copy=True)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\dtypes\cast.py", line 646, in astype_nansafe
to_dtype=dtype))
TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]
Here is the full output of my session:
from sdv import load_demo
metadata, tables = load_demo(metadata=True)
metadata.to_dict()
{
"tables": {
"users": {
"primary_key": "user_id",
"fields": {
"user_id": {
"type": "id",
"subtype": "integer"
},
"country": {
"type": "categorical"
},
"gender": {
"type": "categorical"
},
"age": {
"type": "numerical",
"subtype": "integer"
}
}
},
"sessions": {
"primary_key": "session_id",
"fields": {
"session_id": {
"type": "id",
"subtype": "integer"
},
"user_id": {
"ref": {
"field": "user_id",
"table": "users"
},
"type": "id",
"subtype": "integer"
},
"device": {
"type": "categorical"
},
"os": {
"type": "categorical"
}
}
},
"transactions": {
"primary_key": "transaction_id",
"fields": {
"transaction_id": {
"type": "id",
"subtype": "integer"
},
"session_id": {
"ref": {
"field": "session_id",
"table": "sessions"
},
"type": "id",
"subtype": "integer"
},
"timestamp": {
"type": "datetime",
"format": "%Y-%m-%d"
},
"amount": {
"type": "numerical",
"subtype": "float"
},
"approved": {
"type": "boolean"
}
}
}
}
}
>>> tables
{'users': user_id country gender age
0 0 USA M 34
1 1 UK F 23
2 2 ES None 44
3 3 UK M 22
4 4 USA F 54
5 5 DE M 57
6 6 BG F 45
7 7 ES None 41
8 8 FR F 23
9 9 UK None 30, 'sessions': session_id user_id device os
0 0 0 mobile android
1 1 1 tablet ios
2 2 1 tablet android
3 3 2 mobile android
4 4 4 mobile ios
5 5 5 mobile android
6 6 6 mobile ios
7 7 6 tablet ios
8 8 6 mobile ios
9 9 8 tablet ios, 'transactions': transaction_id session_id timestamp amount approved
0 0 0 2019-01-01 12:34:32 100.0 True
1 1 0 2019-01-01 12:42:21 55.3 True
2 2 1 2019-01-07 17:23:11 79.5 True
3 3 3 2019-01-10 11:08:57 112.1 False
4 4 5 2019-01-10 21:54:08 110.0 False
5 5 5 2019-01-11 11:21:20 76.3 True
6 6 7 2019-01-22 14:44:10 89.5 True
7 7 8 2019-01-23 10:14:09 132.1 False
8 8 9 2019-01-27 16:09:17 68.0 True
9 9 9 2019-01-29 12:10:48 99.9 True}
metadata.visualize()
<graphviz.dot.Digraph object at 0x00000196E8755488>
from sdv import SDV
sdv = SDV()
sdv.fit(metadata, tables)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\tools\Python\3.7\lib\site-packages\sdv\sdv.py", line 69, in fit
self.modeler.model_database(tables)
File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 128, in model_database
self.cpa(table_name, tables)
File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 99, in cpa
child_table = self.cpa(child_name, tables, child_key)
File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 99, in cpa
child_table = self.cpa(child_name, tables, child_key)
File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 92, in cpa
extended = self.metadata.transform(table_name, table)
File "C:\tools\Python\3.7\lib\site-packages\sdv\metadata.py", line 477, in transform
hyper_transformer.fit(data[fields])
File "C:\tools\Python\3.7\lib\site-packages\rdt\hyper_transformer.py", line 128, in fit
transformer.fit(column)
File "C:\tools\Python\3.7\lib\site-packages\rdt\transformers\datetime.py", line 55, in fit
transformed = self._transform(data)
File "C:\tools\Python\3.7\lib\site-packages\rdt\transformers\datetime.py", line 40, in _transform
integers = datetimes.astype(int).astype(float).values
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\generic.py", line 5691, in astype
**kwargs)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\managers.py", line 531, in astype
return self.apply('astype', dtype=dtype, **kwargs)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\managers.py", line 395, in apply
applied = getattr(b, f)(**kwargs)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\blocks.py", line 534, in astype
**kwargs)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\blocks.py", line 2139, in _astype
return super(DatetimeBlock, self)._astype(dtype=dtype, **kwargs)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\blocks.py", line 633, in _astype
values = astype_nansafe(values.ravel(), dtype, copy=True)
File "C:\tools\Python\3.7\lib\site-packages\pandas\core\dtypes\cast.py", line 646, in astype_nansafe
to_dtype=dtype))
TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]
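Best answer
I actually found a solution. Not being a Python developer, I am not sure it is the best one, but it clears the error.
At line 41 of datetime.py (the `_transform` statement shown in the traceback), I changed:
integers = datetimes.astype(int).astype(float).values
to
integers = datetimes.astype(np.int64).astype(float).values
I imagine this could be solved without changing the project code (it is not my code, it is a package I downloaded), but for now this lets me continue my research.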
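A likely explanation for why this works: on Windows with NumPy versions before 2.0, the built-in `int` maps to a 32-bit integer, and pandas refuses to squeeze nanosecond timestamps into 32 bits, whereas `np.int64` always asks for 64 bits. The snippet below is only a minimal sketch of that behaviour outside SDV. It uses two timestamps copied from the demo `transactions` table, and the variable names are illustrative, not taken from the rdt source.

```python
import numpy as np
import pandas as pd

# Two timestamps copied from the demo "transactions" table, forced to
# nanosecond resolution so the integers below are nanoseconds since the epoch.
datetimes = pd.Series(
    pd.to_datetime(["2019-01-01 12:34:32", "2019-01-01 12:42:21"])
).astype("datetime64[ns]")

# The width of NumPy's default integer depends on the platform and the
# NumPy version, which is why astype(int) can behave differently per machine.
print("default integer dtype:", np.dtype(int))

# Asking for a 32-bit integer explicitly is expected to be rejected by pandas,
# which mirrors the error in the traceback (assumption: the exact exception
# message varies between pandas versions).
try:
    datetimes.astype(np.int32)
    int32_error = None
except (TypeError, ValueError) as exc:
    int32_error = exc
print("int32 cast error:", int32_error)

# Asking for 64 bits explicitly works regardless of the platform default.
integers = datetimes.astype(np.int64).astype(float).values
print(integers)
```

If the first `print` shows `int32` on your machine, `astype(int)` is requesting the same cast that fails in the traceback, which would explain why the demo breaks there.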
Regarding "python - Trying to do the SDV (Synthetic Data Vault) demo and getting the error: TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]", the original question is on Stack Overflow: https://stackoverflow.com/questions/60062491/