Problem description
I have two datasets that contain temperature and light sensor readings. The measurements were taken from 22:35:41 to 04:49:41.
The problem with these datasets is plotting the measurements against time, stored in datetime.time format, when the measurements run from one day into the next (22:35:41 to 04:49:41). The plot function automatically starts the axis at 00:00 and puts the data measured before 00:00 at the end of the plot.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Temperature = pd.read_excel("/kaggle/input/Temperature_measurement.xlsx")
Light = pd.read_excel("/kaggle/input/Light_measurement.xlsx")
sns.lineplot(x="Time",y="Light", data = Light)
sns.lineplot(y="Temperature", x="Time", data = Temperature)
plt.show()
Recommended answer
First, you need to convert your times to Pandas Timestamps. A Pandas Timestamp cannot hold a time of day on its own; it always carries a date as well, but that's fine since we'll hide the date part later.
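As a minimal sketch of what "attaching a date" means, the snippet below parses a few time-only strings. The sample values are hypothetical and not taken from the original spreadsheets; the point is only that parsing with a time-only format fills in 1900-01-01 as the date.

```python
import pandas as pd

# Hypothetical sample values, not taken from the original spreadsheets.
times = pd.Series(["22:35:41", "23:59:59", "00:00:10"])

# A time-only format leaves the date unspecified, so every value
# lands on the default date 1900-01-01.
parsed = pd.to_datetime(times, format="%H:%M:%S")
print(parsed)

# The time-of-day part is preserved unchanged.
round_trip = parsed.dt.strftime("%H:%M:%S")
```

Note that the third value sorts before the first two at this stage, which is exactly the ordering problem described in the question.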
We also need to detect day changes. We can do that by finding where the time wraps around, that is, where a time is smaller than its predecessor.
We can then take the cumulative count of wraps and add that number of days to our timestamps.
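To make the wrap detection concrete, here is a small sketch on hypothetical hourly readings that cross midnight once. The variable names (`wrapped`, `day_offset`, `adjusted`) are made up for illustration.

```python
import pandas as pd

# Hypothetical hourly readings that cross midnight once.
times = pd.to_datetime(
    pd.Series(["22:35:41", "23:35:41", "00:35:41", "01:35:41"]),
    format="%H:%M:%S",
)

# True wherever a time is smaller than the one before it.
# The first row is compared against NaT, which evaluates to False.
wrapped = times.lt(times.shift())

# Running total of wraps = number of days to add to each row.
day_offset = wrapped.cumsum()

# Shift each timestamp forward by its day offset.
adjusted = times + pd.to_timedelta(day_offset, unit="D")
print(adjusted)
```

After the adjustment, the readings taken after midnight fall on 1900-01-02, so the series is in chronological order again. This approach assumes the rows are already stored in the order they were measured.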
Let's define a function that takes the datetime.time objects, converts them to native Pandas Timestamps (using the arbitrary date 1900-01-01, which is the default Pandas fills in when the format contains no date) and adjusts the day according to the wraps (so the times after midnight end up on 1900-01-02):
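def normalize_time(series):
    # Parse the time-only values; the date defaults to 1900-01-01.
    series = pd.to_datetime(series, format="%H:%M:%S")
    # Add one day for every wrap (a time smaller than its predecessor) seen so far.
    series += pd.to_timedelta(series.lt(series.shift()).cumsum(), unit="D")
    return series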
Let's now apply it to our DataFrames:
Light["Time"] = normalize_time(Light["Time"])
Temperature["Time"] = normalize_time(Temperature["Time"])
Plotting the data now looks correct, with the times being continuous. However, the X tick labels will try to display the dates, which we don't really care about, so let's fix that part next.
We can use Matplotlib's set_major_formatter together with a DateFormatter to show times only:
import matplotlib.dates
ax = plt.subplot()
sns.lineplot(x="Time", y="Light", data=Light)
sns.lineplot(x="Time", y="Temperature", data=Temperature)
ax.xaxis.set_major_formatter(
    matplotlib.dates.DateFormatter("%H:%M")
)
plt.show()
This produces an X tick every hour, which seems to be a great fit for this data set.
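The hourly spacing here comes from Matplotlib's automatic tick placement. If you would rather request it explicitly, one option is to pair the formatter with an HourLocator. The sketch below uses synthetic data covering the same time span; the DataFrame contents are made up, and the Agg backend is selected only so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic stand-in for the normalized data (values are made up).
time = pd.date_range("1900-01-01 22:35:41", "1900-01-02 04:49:41", freq="1min")
df = pd.DataFrame({"Time": time, "Light": range(len(time))})

fig, ax = plt.subplots()
ax.plot(df["Time"], df["Light"])

# Ask for one tick per full hour and label it with the time only.
ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))

# Render once so the tick labels are populated.
fig.canvas.draw()
labels = [tick.get_text() for tick in ax.get_xticklabels()]
print(labels)
```

Changing `interval` gives coarser ticks, which may be useful for recordings that span more hours than this one.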