First off, I'm not a Python developer. But I need synthetic data, and I'm trying to use the Synthetic Data Vault (https://github.com/sdv-dev/SDV).

I have installed Python 3.7 (on Windows; for now I'm doing this on my laptop while I learn how it works).

python --version
Python 3.7.6

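I can download the sdv package with pip, and I can run the first few lines of the demo code to load and view the metadata and demo tables. However, when I get to these lines in the demo: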

sdv = SDV()
sdv.fit(metadata, tables)


I get the following error:

TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]

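I haven't modified any of the code from git, and I haven't tried any code of my own. I'm literally just trying to follow the demo in the README. I just installed the package and am working through the first example. Has anyone tried this and hit the same problem? Any ideas on how to resolve this error?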

The full stack trace is:

    sdv.fit(metadata, tables)


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\tools\Python\3.7\lib\site-packages\sdv\sdv.py", line 69, in fit
    self.modeler.model_database(tables)
  File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 128, in model_database
    self.cpa(table_name, tables)
  File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 99, in cpa
    child_table = self.cpa(child_name, tables, child_key)
  File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 99, in cpa
    child_table = self.cpa(child_name, tables, child_key)
  File "C:\tools\Python\3.7\lib\site-packages\sdv\modeler.py", line 92, in cpa
    extended = self.metadata.transform(table_name, table)
  File "C:\tools\Python\3.7\lib\site-packages\sdv\metadata.py", line 477, in transform
    hyper_transformer.fit(data[fields])
  File "C:\tools\Python\3.7\lib\site-packages\rdt\hyper_transformer.py", line 128, in fit
    transformer.fit(column)
  File "C:\tools\Python\3.7\lib\site-packages\rdt\transformers\datetime.py", line 55, in fit
    transformed = self._transform(data)
  File "C:\tools\Python\3.7\lib\site-packages\rdt\transformers\datetime.py", line 40, in _transform
    integers = datetimes.astype(int).astype(float).values
  File "C:\tools\Python\3.7\lib\site-packages\pandas\core\generic.py", line 5691, in astype
    **kwargs)
  File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\managers.py", line 531, in astype
    return self.apply('astype', dtype=dtype, **kwargs)
  File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\managers.py", line 395, in apply
    applied = getattr(b, f)(**kwargs)
  File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\blocks.py", line 534, in astype
    **kwargs)
  File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\blocks.py", line 2139, in _astype
    return super(DatetimeBlock, self)._astype(dtype=dtype, **kwargs)
  File "C:\tools\Python\3.7\lib\site-packages\pandas\core\internals\blocks.py", line 633, in _astype
    values = astype_nansafe(values.ravel(), dtype, copy=True)
  File "C:\tools\Python\3.7\lib\site-packages\pandas\core\dtypes\cast.py", line 646, in astype_nansafe
    to_dtype=dtype))
TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]
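
For anyone reproducing this: the `[int32]` in the message suggests that `astype(int)` resolved to a 32-bit integer on this machine. As far as I know, NumPy releases before 2.0 map the builtin `int` to the C `long` type, which is 32 bits wide on Windows even on a 64-bit install. A minimal check, assuming only that NumPy is installed (the helper name `default_int_dtype` is made up for illustration):

```python
import numpy as np


def default_int_dtype():
    """Return the NumPy dtype that the Python builtin `int` maps to here."""
    return np.dtype(int)


if __name__ == "__main__":
    dtype = default_int_dtype()
    # A 32-bit result means astype(int) cannot hold datetime64[ns] values,
    # which are stored as 64-bit nanosecond counts.
    print(f"astype(int) targets {dtype.name} ({dtype.itemsize * 8}-bit)")
```

If this reports a 32-bit type, that would be consistent with the traceback above; on Linux and macOS the same check usually reports a 64-bit type, which may be why the demo works there.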


Here is the complete output of my session:


    from sdv import load_demo
    metadata, tables = load_demo(metadata=True)
    metadata.to_dict()



    {
      "tables": {
        "users": {
          "primary_key": "user_id",
          "fields": {
            "user_id": {
              "type": "id",
              "subtype": "integer"
            },
            "country": {
              "type": "categorical"
            },
            "gender": {
              "type": "categorical"
            },
            "age": {
              "type": "numerical",
              "subtype": "integer"
            }
          }
        },
        "sessions": {
          "primary_key": "session_id",
          "fields": {
            "session_id": {
              "type": "id",
              "subtype": "integer"
            },
            "user_id": {
              "ref": {
                "field": "user_id",
                "table": "users"
              },
              "type": "id",
              "subtype": "integer"
            },
            "device": {
              "type": "categorical"
            },
            "os": {
              "type": "categorical"
            }
          }
        },
        "transactions": {
          "primary_key": "transaction_id",
          "fields": {
            "transaction_id": {
              "type": "id",
              "subtype": "integer"
            },
            "session_id": {
              "ref": {
                "field": "session_id",
                "table": "sessions"
              },
              "type": "id",
              "subtype": "integer"
            },
            "timestamp": {
              "type": "datetime",
              "format": "%Y-%m-%d"
            },
            "amount": {
              "type": "numerical",
              "subtype": "float"
            },
            "approved": {
              "type": "boolean"
            }
          }
        }
      }
    }



    >>> tables




    {'users':    user_id country gender  age
    0        0     USA      M   34
    1        1      UK      F   23
    2        2      ES   None   44
    3        3      UK      M   22
    4        4     USA      F   54
    5        5      DE      M   57
    6        6      BG      F   45
    7        7      ES   None   41
    8        8      FR      F   23
    9        9      UK   None   30, 'sessions':    session_id  user_id  device       os
    0           0        0  mobile  android
    1           1        1  tablet      ios
    2           2        1  tablet  android
    3           3        2  mobile  android
    4           4        4  mobile      ios
    5           5        5  mobile  android
    6           6        6  mobile      ios
    7           7        6  tablet      ios
    8           8        6  mobile      ios
    9           9        8  tablet      ios, 'transactions':    transaction_id  session_id           timestamp  amount  approved
    0               0           0 2019-01-01 12:34:32   100.0      True
    1               1           0 2019-01-01 12:42:21    55.3      True
    2               2           1 2019-01-07 17:23:11    79.5      True
    3               3           3 2019-01-10 11:08:57   112.1     False
    4               4           5 2019-01-10 21:54:08   110.0     False
    5               5           5 2019-01-11 11:21:20    76.3      True
    6               6           7 2019-01-22 14:44:10    89.5      True
    7               7           8 2019-01-23 10:14:09   132.1     False
    8               8           9 2019-01-27 16:09:17    68.0      True
    9               9           9 2019-01-29 12:10:48    99.9      True}




    metadata.visualize()


<graphviz.dot.Digraph object at 0x00000196E8755488>


    from sdv import SDV
    sdv = SDV()
    sdv.fit(metadata, tables)


    Traceback (most recent call last):
      (identical to the full stack trace shown earlier in this question)
    TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]

Best answer

Actually, I found a solution. Not being a Python developer, I'm not sure it's the best one, but it clears the error.

In the datetime.py code at line 41 (the traceback reports this statement at line 40), I changed:

integers = datetimes.astype(int).astype(float).values

to:

integers = datetimes.astype(np.int64).astype(float).values


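I suspect this could be solved without changing the project's code (it isn't my code, it's a package I downloaded), but for now I can continue my investigation.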
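
To illustrate why the change helps, here is a minimal standalone sketch of the same conversion outside of the package. It is not the library's actual code: the function name `datetimes_to_float` and the sample timestamps (taken from the demo's transactions table) are only for illustration. It assumes pandas and NumPy are installed.

```python
import numpy as np
import pandas as pd


def datetimes_to_float(datetimes: pd.Series) -> np.ndarray:
    """Convert datetime64[ns] values to float nanoseconds since the Unix epoch."""
    # Asking for np.int64 explicitly avoids depending on what the builtin
    # `int` maps to on the current platform.
    return datetimes.astype(np.int64).astype(float).values


# Two timestamps from the demo data, forced to nanosecond resolution.
timestamps = pd.Series(
    pd.to_datetime(["2019-01-01 12:34:32", "2019-01-29 12:10:48"])
).astype("datetime64[ns]")

values = datetimes_to_float(timestamps)

# Converting back shows that nothing was lost in the round trip.
restored = pd.to_datetime(values.astype(np.int64))
```

Because the target width is spelled out, the conversion should behave the same on Windows as elsewhere. Upgrading to a newer release of the package may also be worth trying before patching an installed file by hand, though I have not checked which release, if any, includes this change.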

Source: the Stack Overflow question "python - Trying to do the SDV (Synthetic Data Vault) demo and getting error: TypeError: cannot astype a datetimelike from [datetime64[ns]] to [int32]": https://stackoverflow.com/questions/60062491/
