Using Tulips¶

This tutorial covers the core data structures of the Tulip library: TulipSeries and TulipCollection. Together they attach metadata to financial and economic time series and add helpers for summaries, dashboards, and persistence.

Table of Contents¶

TulipSeries: Enhanced Time Series
TulipCollection: Multi-Series Container
Creating Collections from Data Sources
Dashboard Generation
Data Persistence
Advanced Usage

TulipSeries: Enhanced Time Series¶

TulipSeries is a thin wrapper around pandas.Series that adds domain-specific metadata and helper methods for financial/economic analysis.

Core Features¶

Time-indexed data: Inherits all pandas.Series functionality
Rich metadata: Stores frequency, source information, units, dates
Domain-specific methods: summary(), plot(), and analytical helpers
Automatic metadata tracking: Last updated timestamps, data provenance

Key Attributes¶

# Core data structure
series.ts            # The underlying pandas Series
series.id            # Unique identifier (e.g., "UNRATE")
series.last_updated  # When data was last refreshed
series.info          # Metadata container (title, units, frequency, source, etc.)
series.info.title    # Human-readable name (e.g., "Unemployment Rate")
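To make the attribute layout concrete, here is a minimal sketch of how a metadata-carrying wrapper around a pandas Series could be organized. The class names SketchInfo and SketchSeries, their fields, and the sample values are hypothetical illustrations, not Tulip's actual implementation; only the attribute names id, ts, info, and last_updated mirror those used in this tutorial.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

import pandas as pd


@dataclass
class SketchInfo:
    """Hypothetical metadata container (not Tulip's real class)."""
    title: str
    units: str = "Not specified"
    frequency: str = "Not specified"
    source: str = "Not specified"


@dataclass(eq=False)  # eq=False: field-by-field equality is ambiguous for a Series
class SketchSeries:
    """Hypothetical wrapper pairing a pandas Series with its metadata."""
    id: str
    ts: pd.Series
    info: SketchInfo
    last_updated: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


# Synthetic values, used only to show the structure
index = pd.date_range("2024-01-01", periods=3, freq="MS")
example = SketchSeries(
    id="EXAMPLE",
    ts=pd.Series([1.0, 2.0, 3.0], index=index),
    info=SketchInfo(title="Example Indicator", units="Percent", frequency="Monthly"),
)

print(example.id, "|", example.info.title, "|", example.info.units)
print(example.ts.tail(2))
```

Keeping the data in one attribute and the metadata in another means ordinary pandas operations can be applied to the data without touching the descriptive fields.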

Essential Methods¶

summary(): Statistical overview with latest values, changes, Z-scores
plot(): Quick visualization using internal plotting functions
Standard pandas operations: All Series methods work normally
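The exact statistics that summary() reports are not spelled out here, so the following is only a sketch of the kind of calculation the description suggests. It assumes the change is measured against the previous observation and the Z-score against the full-sample mean and standard deviation; the function name sketch_summary and the data are hypothetical.

```python
import pandas as pd


def sketch_summary(ts: pd.Series) -> dict:
    """Return the latest value, change vs. the previous point, and a Z-score."""
    latest = ts.iloc[-1]
    previous = ts.iloc[-2]
    return {
        "latest": latest,
        "change_pct": (latest / previous - 1) * 100,
        # Z-score of the latest value against the full-sample mean and the
        # sample standard deviation (pandas default, ddof=1)
        "z_score": (latest - ts.mean()) / ts.std(),
    }


# Synthetic monthly data
values = pd.Series(
    [100.0, 102.0, 104.0, 106.0, 108.0],
    index=pd.date_range("2024-01-01", periods=5, freq="MS"),
)
stats = sketch_summary(values)
print({name: round(float(value), 4) for name, value in stats.items()})
```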

TulipCollection: Multi-Series Container¶

TulipCollection bundles multiple TulipSeries objects for batch operations and comparative analysis.

Primary Use Cases¶

Dashboard generation: Create summary tables and plots for related indicators
Batch analysis: Apply operations across multiple time series
Data pipelines: Feed collections into analytical workflows
Comparative studies: Analyze relationships between economic indicators

Key Methods¶

dashboard.table(): Formatted summary table for all series
dashboard.plots(): Multi-series visualizations with customization
save() / load(): Persistence to/from pickle files
Index access: collection[i] to access individual series
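The container behavior described above (length, index access, iteration, and conversion to a DataFrame) can be sketched with plain pandas. SketchCollection is a hypothetical stand-in written for illustration, not TulipCollection itself, and it stores simple (identifier, Series) pairs rather than TulipSeries objects.

```python
import pandas as pd


class SketchCollection:
    """Hypothetical container of named pandas Series (not TulipCollection)."""

    def __init__(self, named_series):
        # named_series: list of (identifier, pandas Series) pairs
        self._items = list(named_series)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, position):
        return self._items[position]

    def __iter__(self):
        return iter(self._items)

    @property
    def ts(self) -> pd.DataFrame:
        # Align all series on the union of their indexes, one column per id
        return pd.concat({name: s for name, s in self._items}, axis=1)


index = pd.date_range("2024-01-01", periods=4, freq="MS")
collection = SketchCollection([
    ("A", pd.Series([1.0, 2.0, 3.0, 4.0], index=index)),
    ("B", pd.Series([10.0, 20.0], index=index[2:])),  # shorter history
])

print(len(collection), [name for name, _ in collection])
print(collection.ts)
```

Series with different start dates are aligned on a shared index, so the shorter one shows NaN for the dates it does not cover.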

# Import necessary modules
from tulip.core.collection import TulipCollection
from tulip.data.fred import FredClient

# Display all outputs in cells
from IPython.core.interactiveshell import InteractiveShell

InteractiveShell.ast_node_interactivity = "all"

# Create a sample collection using FRED data
# BOGMBASE = Monetary Base, BOGMBBM = Money Supply M1, CPIAUCSL = Consumer Price Index
sample_collection = FredClient.create_collection(
    codes=["BOGMBASE", "BOGMBBM", "CPIAUCSL"]
)

print("Collection created successfully!")
print(f"Number of series in collection: {len(sample_collection)}")
print(f"Collection type: {type(sample_collection)}")

# Examine the first series
first_series = sample_collection[0]
print("\nFirst series details:")
print(f"ID: {first_series.id}")
print(f"Title: {first_series.info.title}")
print(f"Last updated: {first_series.last_updated}")
print(f"Data points: {len(first_series.ts)}")
print(f"Date range: {first_series.ts.index.min()} to {first_series.ts.index.max()}")

Creating Collections from Data Sources¶

TulipCollections are typically created through data client classes that automatically handle data retrieval and TulipSeries creation.

FRED Data Client¶

The Federal Reserve Economic Data (FRED) client provides access to thousands of economic time series.

sample_collection.dashboard.table()

Dashboard Generation¶

TulipCollection’s dashboard attribute generates summary tables and visualizations for every series in the collection.

Dashboard Table¶

The dashboard.table() method creates a formatted summary showing key statistics for all series in the collection.

# Customize series titles for better display
sample_collection[0].info.title = "Monetary Base (Billions $)"
sample_collection[1].info.title = "Money Supply M1 (Billions $)"
sample_collection[2].info.title = "Consumer Price Index"

# Generate dashboard table
print("Dashboard Table for Economic Indicators:")
dashboard_table = sample_collection.dashboard.table()
dashboard_table

Working with Individual TulipSeries¶

Let’s explore the capabilities of individual TulipSeries objects within our collection.

# Access individual series from the collection
monetary_base = sample_collection[0]  # BOGMBASE
money_supply = sample_collection[1]  # BOGMBBM
cpi = sample_collection[2]  # CPIAUCSL

print("=== TulipSeries Attributes ===")
print(f"Monetary Base ID: {monetary_base.id}")
print(f"Monetary Base Title: {monetary_base.info.title}")
print(f"Units: {getattr(monetary_base.info, 'units', 'Not specified')}")
print(f"Frequency: {getattr(monetary_base.info, 'frequency', 'Not specified')}")

print("\n=== Latest Data Points ===")
print(f"Monetary Base (latest): {monetary_base.ts.iloc[-1]:,.0f}")
print(f"Money Supply M1 (latest): {money_supply.ts.iloc[-1]:,.0f}")
print(f"CPI (latest): {cpi.ts.iloc[-1]:.1f}")

print("\n=== Data Access (pandas compatibility) ===")
print("Last 3 CPI values:")
print(cpi.ts.tail(3))

TulipSeries Summary Method¶

The summary() method provides key statistics and insights for a time series.

# Demonstrate the summary() method
print("=== CPI Summary ===")
try:
    cpi_summary = cpi.summary()
    print(cpi_summary)
except Exception as e:
    print(f"Summary method error: {e}")
    print("Computing manual summary statistics:")

    # Manual summary statistics
    latest_value = cpi.ts.iloc[-1]
    prev_month = cpi.ts.iloc[-2] if len(cpi.ts) > 1 else latest_value
    prev_year = cpi.ts.iloc[-13] if len(cpi.ts) > 12 else latest_value

    monthly_change = ((latest_value / prev_month) - 1) * 100
    annual_change = ((latest_value / prev_year) - 1) * 100

    print(f"Latest Value: {latest_value:.2f}")
    print(f"Monthly Change: {monthly_change:.2f}%")
    print(f"Annual Change: {annual_change:.2f}%")
    print(f"Latest Date: {cpi.ts.index[-1]}")

Dashboard Plots¶

The dashboard.plots() method creates multi-series visualizations with various customization options.

# Create plots with customization options
print("Generating dashboard plots...")

# Basic plots
try:
    plots = sample_collection.dashboard.plots()
    plots
except Exception as e:
    print(f"Error generating plots: {e}")
    print("This may be due to environment limitations or missing plot configuration.")

Data Persistence¶

TulipCollections can be saved to and loaded from pickle files, which makes it easy to persist and share data. Save the collection to a file for later use:

sample_collection.save("sample_collection.pickle")

Load it back with TulipCollection.load():

load_back = TulipCollection.load("sample_collection.pickle")

# Verify the loaded collection
print("=== Loaded Collection Verification ===")
print(f"Original collection length: {len(sample_collection)}")
print(f"Loaded collection length: {len(load_back)}")
print(f"Data integrity check: {load_back[0].id == sample_collection[0].id}")

# Compare a few data points
original_latest = sample_collection[0].ts.iloc[-1]
loaded_latest = load_back[0].ts.iloc[-1]
print(f"Data values match: {original_latest == loaded_latest}")

print("\nLoaded collection series IDs:")
for i, series in enumerate(load_back):
    print(f"  {i}: {series.id} - {series.info.title}")

load_back.dashboard.plots()
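The verification above compares only the first series ID and the latest observation. A stricter round-trip check compares every value and index label. The sketch below shows the idea with a plain pandas Series and pandas' own to_pickle / read_pickle functions; it is an assumption-laden illustration and does not use Tulip's save() and load(), whose internals are not shown in this tutorial.

```python
import os
import tempfile

import pandas as pd

# Synthetic data
original = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.date_range("2024-01-01", periods=3, freq="MS"),
    name="EXAMPLE",
)

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "example.pickle")
    original.to_pickle(path)          # write to disk
    restored = pd.read_pickle(path)   # read back

# Series.equals compares all values and the index, not just the last point
round_trip_ok = original.equals(restored)
print(f"Round trip preserved all data: {round_trip_ok}")
```

Because unpickling can execute arbitrary code, pickle files should only be loaded from sources you trust.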

Advanced Usage¶

Here are some advanced techniques for working with TulipSeries and TulipCollections.

# Advanced Usage Examples

# 1. Convert collection to pandas DataFrame
print("=== Converting to DataFrame ===")
try:
    df = load_back.ts  # .ts property converts to DataFrame
    print(f"DataFrame shape: {df.shape}")
    print(f"DataFrame columns: {list(df.columns)}")
    print("First few rows:")
    print(df.head())
except Exception as e:
    print(f"DataFrame conversion error: {e}")

# 2. Customize metadata for better analysis
print("\n=== Metadata Customization ===")
cpi_series = load_back[2]
print(f"Original title: {cpi_series.info.title}")

# Customize metadata
cpi_series.info.quote_units = "%"
if hasattr(cpi_series.info, "frequency"):
    print(f"Frequency: {cpi_series.info.frequency}")

# 3. Mathematical operations (pandas compatibility)
print("\n=== Mathematical Operations ===")
monetary_series = load_back[0]
print(f"Latest value: ${monetary_series.ts.iloc[-1]:,.0f} billion")

# Calculate year-over-year growth
if len(monetary_series.ts) > 12:
    yoy_growth = (monetary_series.ts.iloc[-1] / monetary_series.ts.iloc[-13] - 1) * 100
    print(f"Year-over-year growth: {yoy_growth:.1f}%")

# Calculate 6-month moving average
ma_6m = monetary_series.ts.rolling(6).mean()
print(f"6-month MA (latest): ${ma_6m.iloc[-1]:,.0f} billion")
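The year-over-year calculation above indexes two individual observations. When the whole growth history is needed, the same idea can be expressed with pandas' vectorized pct_change. This sketch uses synthetic monthly data and, like the code above, assumes a monthly frequency, so 12 periods correspond to one year.

```python
import pandas as pd

monthly = pd.Series(
    [100.0 + i for i in range(24)],   # synthetic data: 100, 101, ..., 123
    index=pd.date_range("2023-01-01", periods=24, freq="MS"),
)

# pct_change(12) compares each month with the same month one year earlier;
# this is only a year-over-year figure when the data are monthly
yoy_pct = monthly.pct_change(periods=12) * 100

# rolling(6).mean() is NaN until six observations are available
ma_6m = monthly.rolling(6).mean()

print(f"Latest YoY growth: {yoy_pct.iloc[-1]:.2f}%")
print(f"Latest 6-month MA: {ma_6m.iloc[-1]:.1f}")
```

For quarterly or weekly series the period count would need to change (4 or 52, respectively) to keep the one-year comparison.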

Dashboard Plot Customization¶

The dashboard.plots() method supports various parameters for customizing visualizations:

# Common plot customization options:
collection.dashboard.plots(
    hlines=50,              # Add horizontal reference lines (e.g., PMI 50 line)
    years_limit=3,          # Limit time range to last 3 years
    mma=12,                 # Add 12-period moving average
    tick_suffix='%',        # Add percentage suffix to y-axis
    show_0=True            # Include zero line in plots
)
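How dashboard.plots() applies these options internally is not shown here. As a rough sketch of what a time-range limit and a moving-average overlay involve, the pandas code below trims a synthetic monthly series to its last three years and computes a 12-period moving average. The variable names are illustrative, and the behavior attributed to years_limit and mma is inferred only from the comments above.

```python
import pandas as pd

ts = pd.Series(
    [float(i) for i in range(120)],   # synthetic data
    index=pd.date_range("2015-01-01", periods=120, freq="MS"),
)

# Compute the moving average on the full history first, so the visible
# window has no leading NaN values
mma_12 = ts.rolling(12).mean()

# Keep only the last three years, measured back from the latest observation
cutoff = ts.index.max() - pd.DateOffset(years=3)
recent = ts[ts.index >= cutoff]
recent_mma = mma_12[mma_12.index >= cutoff]

print(f"Window: {recent.index.min():%Y-%m} to {recent.index.max():%Y-%m}")
print(f"Points shown: {len(recent)}, latest 12-period MA: {recent_mma.iloc[-1]:.1f}")
```

Computing the average before trimming matters: averaging only the trimmed window would leave the first eleven points of the overlay empty.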

Creating Collections from Multiple Data Sources¶

You can create collections from different data providers:

# Bloomberg example
from tulip.data.bloomberg import BloombergClient
bb = BloombergClient()
pmi_collection = bb.create_collection(['ISM PRCM Index', 'NAPMPMI Index'])

# Haver example  
from tulip.data.haver import HaverClient
hv = HaverClient()
employment_collection = hv.create_collection(['LRMANUA@USECON', 'CECIINJC@USECON'])

# Manual collection creation
from tulip.core import TulipSeries, TulipCollection
manual_series_list = [series1, series2, series3]  # Your TulipSeries objects
manual_collection = TulipCollection(manual_series_list)

Good/Bad Value Interpretation¶

Collections can specify whether higher values are good or bad for interpretation:

# Set interpretation for specific series (1 = good, -1 = bad)
collection.good_is['INJCJC Index'] = -1    # Unemployment claims (higher = worse)
collection.good_is['GDP Index'] = 1         # GDP growth (higher = better)
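How Tulip uses good_is internally is not documented in this tutorial. One plausible use of a 1 / -1 flag is to sign a change so that a positive result always means an improvement. The helper interpret_change below is hypothetical and only illustrates that convention.

```python
def interpret_change(change: float, good_is: int) -> str:
    """Label a change as improving or deteriorating.

    good_is follows the convention above: 1 means higher values are good,
    -1 means higher values are bad.
    """
    signed = change * good_is
    if signed > 0:
        return "improving"
    if signed < 0:
        return "deteriorating"
    return "unchanged"


# Hypothetical changes in two indicators
claims_label = interpret_change(5.0, good_is=-1)   # a rise in jobless claims
growth_label = interpret_change(0.4, good_is=1)    # a rise in GDP growth
print(claims_label, growth_label)
```

With this convention, the same color coding or ranking logic can be applied to every series regardless of whether a high reading is good or bad news.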

Summary¶

This tutorial covered the essential aspects of TulipSeries and TulipCollection:

Key Takeaways¶

TulipSeries enhances pandas.Series with financial metadata and domain-specific methods
TulipCollection enables batch operations and dashboard generation for multiple series
Data clients (FRED, Bloomberg, Haver) provide easy collection creation
Dashboard tools generate summary tables and customizable visualizations
Persistence allows saving/loading collections for data sharing and workflow continuity
Full pandas compatibility ensures seamless integration with existing data analysis workflows

Best Practices¶

Use descriptive titles for better dashboard readability
Leverage the good_is attribute for proper value interpretation
Take advantage of persistence for expensive data operations
Customize dashboard plots based on data characteristics (reference lines, time ranges, etc.)
Use the .ts property when you need raw pandas DataFrame functionality

The Tulip library provides a robust foundation for financial and economic data analysis, combining the flexibility of pandas with domain-specific enhancements for professional financial research workflows.