Plotting a time series that spans more than one day, with different timestamps in datetime.time format


Problem description

I have two datasets that contain temperature and light sensor readings. The measurements were taken from 22:35:41 to 04:49:41.

The problem with these datasets is plotting the measurements against their datetime.time values when the readings run from one day into the next (22:35:41 to 04:49:41). The plot function automatically starts the axis at 00:00 and puts the data measured before 00:00 at the end of the plot.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

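# Load the two sensor logs; the "Time" column holds datetime.time values.
Temperature = pd.read_excel("/kaggle/input/Temperature_measurement.xlsx")
Light = pd.read_excel("/kaggle/input/Light_measurement.xlsx")

# Plot both series against the raw time of day (this reproduces the problem).
sns.lineplot(x="Time", y="Light", data=Light)
sns.lineplot(x="Time", y="Temperature", data=Temperature)
plt.show()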

Here is the link to the dataset.

Here is the link to the Jupyter Notebook.

Recommended answer

First, convert your times to pandas Timestamps. A pandas Timestamp cannot hold a time of day on its own; it always has a date attached, but that's fine since we'll hide that part later.

We also need to detect day changes. A day change shows up where the time wraps around midnight, which we can find by looking for a time that is smaller than its predecessor.

We can then count the cumulative number of wraps and add that many days to each timestamp.
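To make the wrap-counting idea concrete, here is a small sketch that prints each intermediate step. The four readings are made-up values, not the asker's data, and the sketch assumes the rows are already in chronological order. It converts the datetime.time objects to strings before parsing, which is an extra precaution rather than something the answer requires.

```python
import datetime

import pandas as pd

# Hypothetical readings that cross midnight once (not the asker's data).
times = pd.Series([
    datetime.time(22, 35, 41),
    datetime.time(23, 50, 0),
    datetime.time(0, 10, 0),
    datetime.time(4, 49, 41),
])

# Parse via strings; with no date in the format, the date is 1900-01-01.
stamps = pd.to_datetime(times.astype(str), format="%H:%M:%S")

# True wherever a reading is earlier than the one before it (a midnight wrap).
wrapped = stamps.lt(stamps.shift())

# Running count of wraps, i.e. the number of whole days to add to each row.
day_offset = wrapped.cumsum()

adjusted = stamps + pd.to_timedelta(day_offset, unit="D")

print(pd.DataFrame({
    "wrapped": wrapped,
    "day_offset": day_offset,
    "adjusted": adjusted,
}))
```

Only the third reading is flagged as a wrap, so it and every later row are shifted forward by one day, which makes the adjusted column strictly increasing.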

Let's define a function that takes the datetime.time objects, converts them to native pandas Timestamps (using an arbitrary date of 1900-01-01, which is the default pandas assigns when no date is given) and adjusts the day according to the wraps (so our final times end up on 1900-01-02):

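def normalize_time(series):
    # Parse the times; with no date in the format, pandas uses 1900-01-01.
    series = pd.to_datetime(series, format="%H:%M:%S")
    # A value smaller than its predecessor marks a wrap past midnight;
    # the running count of wraps is the number of days to add.
    series += pd.to_timedelta(series.lt(series.shift()).cumsum(), unit="D")
    return series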

Now let's apply it to our DataFrames:

Light["Time"] = normalize_time(Light["Time"])
Temperature["Time"] = normalize_time(Temperature["Time"])

Plotting the data now looks correct, with continuous times. The only remaining issue is that the X tick labels will try to display the dates, which we don't care about, so let's fix that next.

We can use Matplotlib's set_major_formatter together with a DateFormatter to show only the times:

import matplotlib.dates

ax = plt.subplot()

sns.lineplot(x="Time", y="Light", data=Light)
sns.lineplot(x="Time", y="Temperature", data=Temperature)

ax.xaxis.set_major_formatter(
    matplotlib.dates.DateFormatter("%H:%M")
)

plt.show()

This produces X ticks every hour, which seems to be a great fit for this dataset.
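The hourly spacing comes from Matplotlib's automatic tick placement, so it may differ for a longer or shorter recording. If you want to fix the spacing yourself, one option is to set an explicit locator next to the formatter. The sketch below assumes synthetic data in place of the normalized "Time" column (the `demo` frame is hypothetical) and uses plain Matplotlib plotting and a headless backend so it runs on its own.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the normalized "Time" column.
time = pd.date_range("1900-01-01 22:35:41", "1900-01-02 04:49:41", freq="1min")
demo = pd.DataFrame({"Time": time, "Light": range(len(time))})

fig, ax = plt.subplots()
ax.plot(demo["Time"], demo["Light"])

# Place a major tick on every full hour and label it with the time only.
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))

fig.canvas.draw()  # render once so the ticks and labels are computed
```

Changing `interval` to 2 would label every second hour, which can help if the labels start to overlap.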
