
timeseriesdatasets Documentation

Welcome

The timeseriesdatasets package is a Python collection of time series datasets that load as pandas DataFrames. Topics include stock market prices, climate and weather observations, energy generation and consumption, air passengers, cryptocurrency, sensor readings, sales records, and global temperature anomalies.

Examples of what the package contains: daily stock price records, cryptocurrency OHLCV (open, high, low, close, volume) data, retail sales data, power consumption records, and meat price series.

Philosophy

The author's vision is to create specialized dataset packages focused on specific themes and topics. Instead of searching through multiple generic data packages to find relevant datasets, users can go directly to a thematic package where all datasets are carefully curated around a particular subject.

In timeseriesdatasets, every dataset is intended for time series analysis, forecasting, and statistical modeling. The package is aimed at researchers, data scientists, statisticians, econometricians, and students working in finance, climate science, energy, transportation, and machine learning.

Cross-Platform Ecosystem

timeseriesdatasets has a sibling package in the R ecosystem called timeSeriesDataSets, maintaining consistency across programming languages and ensuring that users can work with the same high-quality datasets whether they prefer Python or R.

This cross-platform approach reflects the author's commitment to making specialized datasets accessible to the widest possible audience, regardless of their preferred data analysis environment.

Getting Started

Installation

The easiest way to install timeseriesdatasets is directly from PyPI:

pip install timeseriesdatasets

From GitHub (Latest Development Version)

To get the latest development version with the newest features and bug fixes:

pip install git+https://github.com/lightbluetitan/timeseriesdatasets-py

Quick Start Tutorial

1. Import the Package

import timeseriesdatasets as ts

2. List Available Datasets

See all datasets included in the package:

# Get list of all datasets
datasets = ts.list_datasets()
print(datasets)
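
When the list is long, filtering it by keyword can help. The sketch below assumes that ts.list_datasets() returns an iterable of plain name strings, which this page does not confirm. The helper filter_names is hypothetical (it is not part of the package), and the sample list stands in for the real call, using only names mentioned in this documentation.

```python
def filter_names(names, keyword):
    """Return the dataset names that contain `keyword`, sorted alphabetically."""
    keyword = keyword.lower()
    return sorted(name for name in names if keyword in name.lower())


# Stand-in for ts.list_datasets(); assumed to yield plain strings.
sample_names = ["microsoft_stock", "yahoo_stock", "tesla_stock", "nvidia_stock"]

stock_names = filter_names(sample_names, "stock")
print(stock_names)
```

With the package installed, sample_names could be replaced by ts.list_datasets(), provided the return type matches the assumption above.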

3. Load a Dataset

Load any dataset as a pandas DataFrame:

# Load microsoft_stock
df = ts.load_dataset('microsoft_stock')

# Display first rows
print(df.head())

# Check dataset dimensions
print(f"Shape: {df.shape}")
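
This page does not say what ts.load_dataset does when a name is misspelled. One possible approach, sketched here under that uncertainty, is to check the name against the available list first and suggest close matches. The wrapper load_checked is hypothetical and takes the listing and loading functions as arguments, so it makes no claims about the package internals.

```python
import difflib


def load_checked(name, list_fn, load_fn):
    """Load `name` via `load_fn` after confirming it appears in `list_fn()`.

    Raises ValueError with close-match suggestions when the name is unknown.
    """
    available = list(list_fn())
    if name not in available:
        suggestions = difflib.get_close_matches(name, available, n=3)
        hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
        raise ValueError(f"Unknown dataset {name!r}.{hint}")
    return load_fn(name)


# With the package installed, the call would look like this
# (assuming ts.list_datasets() returns the names as strings):
#   df = load_checked("microsoft_stock", ts.list_datasets, ts.load_dataset)
```

Passing the functions in keeps the wrapper independent of the package, so it can be tried with stand-in functions before relying on it.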

4. Describe a Dataset

Print a description of a single dataset:

# Describe tesla_stock
print(ts.describe("tesla_stock"))

Basic Concepts

Dataset Naming Convention

All dataset names in timeseriesdatasets follow a consistent naming pattern:

  • Lowercase with underscores: yahoo_stock
  • Descriptive names that reflect content
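
The convention above can be checked with a short regular expression. This is a minimal sketch of one interpretation: the pattern, the decision to allow digits, and the helper follows_convention are assumptions made for illustration, not rules taken from the package.

```python
import re

# Lowercase letters and digits, in groups separated by single underscores.
NAME_PATTERN = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")


def follows_convention(name):
    """Return True if `name` is lowercase with underscore-separated parts."""
    return NAME_PATTERN.fullmatch(name) is not None


print(follows_convention("yahoo_stock"))   # True
print(follows_convention("Yahoo-Stock"))   # False
```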

Some Datasets Available in timeseriesdatasets

Every dataset is a time series intended for data analysis, forecasting, statistical analysis, and machine learning. For example:

  • microsoft_stock: Microsoft stock price time series from 2015 to 2021.
  • yahoo_stock: Yahoo stock price time series data for forecasting.
  • tesla_stock: Tesla stock price time series data for daily trading and forecasting.
  • nvidia_stock: NVIDIA stock price time series from 1999 to 2025.

Data Licenses

All datasets keep their original open licenses:

  • Most datasets use CC0: Public Domain (free for any use)
  • Some use the MIT License or the Apache License 2.0
  • The timeseriesdatasets package itself is licensed under the MIT License