tsam.result¶

tsam.result ¶

Result classes for tsam aggregation.

AccuracyMetrics `dataclass` ¶

Accuracy metrics comparing aggregated to original time series.

Attributes:

Name	Type	Description
`rmse`	`Series`	Root Mean Square Error per column, comparing the original and reconstructed time series point-by-point over time.
`mae`	`Series`	Mean Absolute Error per column, comparing the original and reconstructed time series point-by-point over time.
`rmse_duration`	`Series`	RMSE on duration curves per column. Duration curves are created by sorting values in descending order, so this metric captures how well the aggregation preserves the overall value distribution regardless of temporal ordering.
`rescale_deviations`	`DataFrame`	Rescaling deviation information per column. Contains columns: - deviation_pct: Final deviation percentage after rescaling - converged: Whether rescaling converged within max iterations - iterations: Number of iterations used Only populated if rescaling was enabled, otherwise empty DataFrame.
`weighted_rmse`	`float`	Weighted root-mean-square of per-column RMSE values: `sqrt(sum(rmse_i² * w_i) / sum(w_i))`. Equals the RMSE over all pooled (weighted) residuals. With uniform weights this matches the old `totalAccuracyIndicators()["RMSE"]`.
`weighted_mae`	`float`	Weighted arithmetic mean of per-column MAE values: `sum(mae_i * w_i) / sum(w_i)`.
`weighted_rmse_duration`	`float`	Weighted root-mean-square of per-column duration-curve RMSE values: `sqrt(sum(rmse_dur_i² * w_i) / sum(w_i))`.

Source code in src/tsam/result.py

@dataclass
class AccuracyMetrics:
    """Accuracy metrics comparing aggregated to original time series.

    Attributes
    ----------
    rmse : pd.Series
        Root Mean Square Error per column, comparing the original and
        reconstructed time series point-by-point over time.
    mae : pd.Series
        Mean Absolute Error per column, comparing the original and
        reconstructed time series point-by-point over time.
    rmse_duration : pd.Series
        RMSE on duration curves per column. Duration curves are created
        by sorting values in descending order, so this metric captures
        how well the aggregation preserves the overall value distribution
        regardless of temporal ordering.
    rescale_deviations : pd.DataFrame
        Rescaling deviation information per column. Contains columns:
        - deviation_pct: Final deviation percentage after rescaling
        - converged: Whether rescaling converged within max iterations
        - iterations: Number of iterations used
        Only populated if rescaling was enabled, otherwise empty DataFrame.
    weighted_rmse : float
        Weighted root-mean-square of per-column RMSE values:
        ``sqrt(sum(rmse_i² * w_i) / sum(w_i))``.
        Equals the RMSE over all pooled (weighted) residuals.
        With uniform weights this matches the old ``totalAccuracyIndicators()["RMSE"]``.
    weighted_mae : float
        Weighted arithmetic mean of per-column MAE values:
        ``sum(mae_i * w_i) / sum(w_i)``.
    weighted_rmse_duration : float
        Weighted root-mean-square of per-column duration-curve RMSE values:
        ``sqrt(sum(rmse_dur_i² * w_i) / sum(w_i))``.
    """

    rmse: pd.Series
    mae: pd.Series
    rmse_duration: pd.Series
    rescale_deviations: pd.DataFrame
    weighted_rmse: float
    weighted_mae: float
    weighted_rmse_duration: float

    @property
    def summary(self) -> pd.DataFrame:
        """Summary DataFrame with all metrics per column.

        Returns
        -------
        pd.DataFrame
            DataFrame with columns: rmse, mae, rmse_duration, and deviation_pct
            (if rescaling was enabled). Index is the original column names.
        """
        df = pd.DataFrame(
            {
                "rmse": self.rmse,
                "mae": self.mae,
                "rmse_duration": self.rmse_duration,
            }
        )
        if not self.rescale_deviations.empty:
            df["deviation_pct"] = self.rescale_deviations["deviation_pct"]
        return df

    def __repr__(self) -> str:
        rescale_info = ""
        if not self.rescale_deviations.empty:
            n_failed = (~self.rescale_deviations["converged"]).sum()
            if n_failed > 0:
                max_dev = self.rescale_deviations["deviation_pct"].max()
                rescale_info = f",\n  rescale_failures={n_failed} (max {max_dev:.2f}%)"
        return (
            f"AccuracyMetrics(\n"
            f"  rmse={self.weighted_rmse:.4f} (weighted),\n"
            f"  mae={self.weighted_mae:.4f} (weighted),\n"
            f"  rmse_duration={self.weighted_rmse_duration:.4f} (weighted){rescale_info}\n"
            f")"
        )

summary `property` ¶

summary: DataFrame

Summary DataFrame with all metrics per column.

Returns:

Type	Description
`DataFrame`	DataFrame with columns: rmse, mae, rmse_duration, and deviation_pct (if rescaling was enabled). Index is the original column names.

AggregationResult `dataclass` ¶

Result of time series aggregation.

This class holds all outputs from the aggregation process and provides convenient methods for accessing and exporting the results.

Attributes:

Name	Type	Description
`cluster_representatives`	`DataFrame`	The aggregated typical periods with MultiIndex (cluster, timestep). Each row represents one timestep in one cluster representative.
`cluster_assignments`	`ndarray`	Which cluster each original period belongs to. Length equals the number of original periods. Values are cluster indices (0 to n_clusters-1).
`cluster_weights`	`dict[int, int]`	How many original periods each cluster represents. Keys are cluster indices, values are occurrence counts.
`n_clusters`	`int`	Number of clusters (typical periods).
`n_timesteps_per_period`	`int`	Number of timesteps in each period.
`n_segments`	`int \| None`	Number of segments per period if segmentation was used, else None.
`segment_durations`	`tuple[tuple[int, ...], ...] \| None`	Duration (in timesteps) for each segment in each typical period. Outer tuple has one entry per typical period, inner tuple has duration for each segment. Use for transferring to another aggregation.
`accuracy`	`AccuracyMetrics`	Accuracy metrics comparing reconstructed to original data.
`clustering_duration`	`float`	Time taken for clustering in seconds.
`is_transferred`	`bool`	Whether this result was created by applying a transferred clustering (via `ClusteringResult.apply()`) rather than by clustering this data directly.

Examples:

>>> result = tsam.aggregate(df, n_clusters=8)
>>> result.cluster_representatives
                    solar  wind  demand
cluster timestep
0       0           0.12   0.45   0.78
        1           0.15   0.42   0.82
...

>>> result.cluster_weights
{0: 45, 1: 52, 2: 38, ...}

>>> result.accuracy.rmse
solar     0.023
wind      0.041
demand    0.015
dtype: float64

Source code in src/tsam/result.py

@dataclass
class AggregationResult:
    """Result of time series aggregation.

    This class holds all outputs from the aggregation process and provides
    convenient methods for accessing and exporting the results.

    Attributes
    ----------
    cluster_representatives : pd.DataFrame
        The aggregated typical periods with MultiIndex (cluster, timestep).
        Each row represents one timestep in one cluster representative.

    cluster_assignments : np.ndarray
        Which cluster each original period belongs to.
        Length equals the number of original periods.
        Values are cluster indices (0 to n_clusters-1).

    cluster_weights : dict[int, int]
        How many original periods each cluster represents.
        Keys are cluster indices, values are occurrence counts.

    n_clusters : int
        Number of clusters (typical periods).

    n_timesteps_per_period : int
        Number of timesteps in each period.

    n_segments : int | None
        Number of segments per period if segmentation was used, else None.

    segment_durations : tuple[tuple[int, ...], ...] | None
        Duration (in timesteps) for each segment in each typical period.
        Outer tuple has one entry per typical period, inner tuple has
        duration for each segment. Use for transferring to another aggregation.

    accuracy : AccuracyMetrics
        Accuracy metrics comparing reconstructed to original data.

    clustering_duration : float
        Time taken for clustering in seconds.

    is_transferred : bool
        Whether this result was created by applying a transferred clustering
        (via ``ClusteringResult.apply()``) rather than by clustering this data directly.

    Examples
    --------
    >>> result = tsam.aggregate(df, n_clusters=8)
    >>> result.cluster_representatives
                        solar  wind  demand
    cluster timestep
    0       0           0.12   0.45   0.78
            1           0.15   0.42   0.82
    ...

    >>> result.cluster_weights
    {0: 45, 1: 52, 2: 38, ...}

    >>> result.accuracy.rmse
    solar     0.023
    wind      0.041
    demand    0.015
    dtype: float64
    """

    cluster_representatives: pd.DataFrame
    cluster_weights: dict[int, int]
    n_timesteps_per_period: int
    segment_durations: tuple[tuple[int, ...], ...] | None
    accuracy: AccuracyMetrics
    clustering_duration: float
    clustering: ClusteringResult
    is_transferred: bool
    _aggregation: TimeSeriesAggregation = field(repr=False, compare=False)

    @cached_property
    def n_clusters(self) -> int:
        """Number of clusters (typical periods).

        Derived from the cluster_representatives DataFrame index,
        which is the authoritative source. Note: cluster_weights may
        have more entries than actual cluster IDs due to tsam quirks.
        """
        return self.cluster_representatives.index.get_level_values(0).nunique()

    @cached_property
    def n_segments(self) -> int | None:
        """Number of segments per period if segmentation was used, else None."""
        return self.clustering.n_segments

    @cached_property
    def cluster_assignments(self) -> np.ndarray:
        """Which cluster each original period belongs to.

        Length equals the number of original periods.
        Values are cluster indices (0 to n_clusters-1).
        """
        return np.array(self.clustering.cluster_assignments)

    def __repr__(self) -> str:
        seg_info = f", n_segments={self.n_segments}" if self.n_segments else ""
        transferred_info = ", is_transferred=True" if self.is_transferred else ""
        return (
            f"AggregationResult(\n"
            f"  n_clusters={self.n_clusters},\n"
            f"  n_timesteps_per_period={self.n_timesteps_per_period}{seg_info}{transferred_info},\n"
            f"  accuracy={self.accuracy}\n"
            f")"
        )

    @cached_property
    def original(self) -> pd.DataFrame:
        """Original time series data.

        Returns
        -------
        pd.DataFrame
            The original input time series with datetime index.

        Examples
        --------
        >>> result = tsam.aggregate(df, n_clusters=8)
        >>> result.original.shape == df.shape
        True
        """
        return cast("pd.DataFrame", self._aggregation.timeSeries)

    @cached_property
    def reconstructed(self) -> pd.DataFrame:
        """Reconstructed time series from typical periods.

        Each original period is replaced by its assigned cluster representative.
        This is cached for performance since reconstruction can be expensive.

        Returns
        -------
        pd.DataFrame
            Reconstructed time series with same shape as original.

        Examples
        --------
        >>> result = tsam.aggregate(df, n_clusters=8)
        >>> result.reconstructed.shape == df.shape
        True
        """
        return cast("pd.DataFrame", self._aggregation.predictOriginalData())

    def disaggregate(self, data: pd.DataFrame) -> pd.DataFrame:
        """Expand typical-period data back to the original time series length.

        Each original period is replaced by its assigned cluster representative
        from ``data``. The result uses the original datetime index.

        Parameters
        ----------
        data : pd.DataFrame
            Typical-period data matching ``cluster_representatives``:

            - ``(cluster, timestep)`` MultiIndex for non-segmented, or
            - ``(cluster, segment, duration)`` MultiIndex for segmented.

        Returns
        -------
        pd.DataFrame
            Disaggregated data with the original datetime index.
            For segmented input, non-segment-start timesteps are NaN.

        Examples
        --------
        >>> result = tsam.aggregate(df, n_clusters=8)
        >>> optimized = run_optimization(result.cluster_representatives)
        >>> full_year = result.disaggregate(optimized)
        """
        expanded = self.clustering.disaggregate(data)
        # Trim to original length (last period may be padded) and restore datetime index
        expanded = expanded.iloc[: len(self.original)]
        expanded.index = self.original.index
        return cast("pd.DataFrame", expanded)

    @cached_property
    def residuals(self) -> pd.DataFrame:
        """Residuals (original - reconstructed).

        Positive values indicate the original exceeded the reconstruction.

        Returns
        -------
        pd.DataFrame
            Residual time series with same shape as original.

        Examples
        --------
        >>> result = tsam.aggregate(df, n_clusters=8)
        >>> result.residuals.mean()  # Should be close to zero
        """
        return cast("pd.DataFrame", self.original - self.reconstructed)

    def to_dict(self) -> dict:
        """Export results as a dictionary for serialization.

        Returns
        -------
        dict
            Dictionary containing all result data in serializable format.
        """
        return {
            "cluster_representatives": self.cluster_representatives.to_dict(),
            "cluster_assignments": self.cluster_assignments.tolist(),
            "cluster_weights": self.cluster_weights,
            "n_clusters": self.n_clusters,
            "n_timesteps_per_period": self.n_timesteps_per_period,
            "n_segments": self.n_segments,
            "segment_durations": self.segment_durations,
            "clustering": self.clustering.to_dict(),
            "accuracy": {
                "rmse": self.accuracy.rmse.to_dict(),
                "mae": self.accuracy.mae.to_dict(),
                "rmse_duration": self.accuracy.rmse_duration.to_dict(),
                "rescale_deviations": self.accuracy.rescale_deviations.to_dict(),
                "weighted_rmse": self.accuracy.weighted_rmse,
                "weighted_mae": self.accuracy.weighted_mae,
                "weighted_rmse_duration": self.accuracy.weighted_rmse_duration,
            },
            "clustering_duration": self.clustering_duration,
        }

    @property
    def timestep_index(self) -> list[int]:
        """Get the timestep or segment indices.

        Returns
        -------
        list[int]
            List of indices [0, 1, ..., n-1] where n is n_segments
            if segmentation was used, otherwise n_timesteps_per_period.
        """
        n = self.n_segments if self.n_segments else self.n_timesteps_per_period
        return list(range(n))

    @property
    def period_index(self) -> list[int]:
        """Get the period (cluster) indices.

        Returns the actual cluster IDs from the cluster_representatives
        DataFrame, which is the authoritative source.

        Returns
        -------
        list[int]
            Sorted list of cluster indices present in cluster_representatives.
        """
        return sorted(self.cluster_representatives.index.get_level_values(0).unique())

    @property
    def assignments(self) -> pd.DataFrame:
        """Get timestep-level assignment information.

        Returns a DataFrame with one row per original timestep containing
        assignment information for transferring results to another aggregation.

        Columns
        -------
        period_idx : int
            Index of the original period (0-indexed, 0 to n_original_periods-1).
        timestep_idx : int
            Timestep index within the period (0 to n_timesteps_per_period-1).
        cluster_idx : int
            Which cluster this period is assigned to (0 to n_clusters-1).
        segment_idx : int (only if segmentation was used)
            Which segment this timestep belongs to within its period.

        Returns
        -------
        pd.DataFrame
            DataFrame indexed by original time index with assignment columns.

        Examples
        --------
        >>> result = tsam.aggregate(df, n_clusters=8)
        >>> result.assignments.head()
                             period_idx  timestep_idx  cluster_idx
        2010-01-01 00:00:00          0             0            3
        2010-01-01 01:00:00          0             1            3
        ...

        >>> # Save and reload assignments
        >>> result.assignments.to_csv("assignments.csv")
        """
        agg = self._aggregation

        # Build period_idx and timestep_idx for each original timestep
        period_indices = []
        timestep_indices = []
        cluster_indices = []

        for orig_period_idx, cluster_idx in enumerate(self.cluster_assignments):
            for timestep in range(self.n_timesteps_per_period):
                period_indices.append(orig_period_idx)
                timestep_indices.append(timestep)
                cluster_indices.append(cluster_idx)

        result_df = pd.DataFrame(
            {
                "period_idx": period_indices,
                "timestep_idx": timestep_indices,
                "cluster_idx": cluster_indices,
            },
            index=agg.timeIndex,
        )

        # Add segment_idx if segmentation was used
        if self.n_segments is not None and hasattr(
            agg, "segmentedNormalizedTypicalPeriods"
        ):
            segment_indices = []
            for cluster_idx in self.cluster_assignments:
                # Get segment structure for this cluster's typical period
                segment_data = agg.segmentedNormalizedTypicalPeriods.loc[cluster_idx]
                # Segment Step is level 0, Segment Duration is level 1
                segment_steps = segment_data.index.get_level_values(0)
                segment_durations = segment_data.index.get_level_values(1)
                # Repeat each segment index by its duration
                segment_indices.extend(
                    np.repeat(segment_steps, segment_durations).tolist()
                )
            result_df["segment_idx"] = segment_indices

        return result_df

    @property
    def plot(self) -> ResultPlotAccessor:
        """Access plotting methods.

        Returns a plotting accessor with methods for visualizing the results.

        Returns
        -------
        ResultPlotAccessor
            Accessor with plotting methods.

        Examples
        --------
        >>> result = tsam.aggregate(df, n_clusters=8)
        >>> result.plot.compare()  # Compare original vs reconstructed
        >>> result.plot.residuals()  # View reconstruction errors
        >>> result.plot.cluster_representatives()
        >>> result.plot.cluster_members()  # All periods per cluster
        >>> result.plot.cluster_weights()
        >>> result.plot.accuracy()
        """
        from tsam.plot import ResultPlotAccessor

        return ResultPlotAccessor(self)

n_clusters `cached` `property` ¶

n_clusters: int

Number of clusters (typical periods).

Derived from the cluster_representatives DataFrame index, which is the authoritative source. Note: cluster_weights may have more entries than actual cluster IDs due to tsam quirks.

n_segments `cached` `property` ¶

n_segments: int | None

Number of segments per period if segmentation was used, else None.

cluster_assignments `cached` `property` ¶

cluster_assignments: ndarray

Which cluster each original period belongs to.

Length equals the number of original periods. Values are cluster indices (0 to n_clusters-1).

original `cached` `property` ¶

original: DataFrame

Original time series data.

Returns:

Type	Description
`DataFrame`	The original input time series with datetime index.

Examples:

>>> result = tsam.aggregate(df, n_clusters=8)
>>> result.original.shape == df.shape
True

reconstructed `cached` `property` ¶

reconstructed: DataFrame

Reconstructed time series from typical periods.

Each original period is replaced by its assigned cluster representative. This is cached for performance since reconstruction can be expensive.

Returns:

Type	Description
`DataFrame`	Reconstructed time series with same shape as original.

Examples:

>>> result = tsam.aggregate(df, n_clusters=8)
>>> result.reconstructed.shape == df.shape
True

residuals `cached` `property` ¶

residuals: DataFrame

Residuals (original - reconstructed).

Positive values indicate the original exceeded the reconstruction.

Returns:

Type	Description
`DataFrame`	Residual time series with same shape as original.

Examples:

>>> result = tsam.aggregate(df, n_clusters=8)
>>> result.residuals.mean()  # Should be close to zero

timestep_index `property` ¶

timestep_index: list[int]

Get the timestep or segment indices.

Returns:

Type	Description
`list[int]`	List of indices [0, 1, ..., n-1] where n is n_segments if segmentation was used, otherwise n_timesteps_per_period.

period_index `property` ¶

period_index: list[int]

Get the period (cluster) indices.

Returns the actual cluster IDs from the cluster_representatives DataFrame, which is the authoritative source.

Returns:

Type	Description
`list[int]`	Sorted list of cluster indices present in cluster_representatives.

assignments `property` ¶

assignments: DataFrame

Get timestep-level assignment information.

Returns a DataFrame with one row per original timestep containing assignment information for transferring results to another aggregation.

Columns

period_idx : int Index of the original period (0-indexed, 0 to n_original_periods-1). timestep_idx : int Timestep index within the period (0 to n_timesteps_per_period-1). cluster_idx : int Which cluster this period is assigned to (0 to n_clusters-1). segment_idx : int (only if segmentation was used) Which segment this timestep belongs to within its period.

Returns:

Type	Description
`DataFrame`	DataFrame indexed by original time index with assignment columns.

Examples:

>>> result = tsam.aggregate(df, n_clusters=8)
>>> result.assignments.head()
                     period_idx  timestep_idx  cluster_idx
2010-01-01 00:00:00          0             0            3
2010-01-01 01:00:00          0             1            3
...

>>> # Save and reload assignments
>>> result.assignments.to_csv("assignments.csv")

plot `property` ¶

plot: ResultPlotAccessor

Access plotting methods.

Returns a plotting accessor with methods for visualizing the results.

Returns:

Type	Description
`ResultPlotAccessor`	Accessor with plotting methods.

Examples:

>>> result = tsam.aggregate(df, n_clusters=8)
>>> result.plot.compare()  # Compare original vs reconstructed
>>> result.plot.residuals()  # View reconstruction errors
>>> result.plot.cluster_representatives()
>>> result.plot.cluster_members()  # All periods per cluster
>>> result.plot.cluster_weights()
>>> result.plot.accuracy()

disaggregate ¶

disaggregate(data: DataFrame) -> pd.DataFrame

Expand typical-period data back to the original time series length.

Each original period is replaced by its assigned cluster representative from data. The result uses the original datetime index.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	Typical-period data matching `cluster_representatives`: `(cluster, timestep)` MultiIndex for non-segmented, or `(cluster, segment, duration)` MultiIndex for segmented.	required

Returns:

Type	Description
`DataFrame`	Disaggregated data with the original datetime index. For segmented input, non-segment-start timesteps are NaN.

Examples:

>>> result = tsam.aggregate(df, n_clusters=8)
>>> optimized = run_optimization(result.cluster_representatives)
>>> full_year = result.disaggregate(optimized)

Source code in src/tsam/result.py

def disaggregate(self, data: pd.DataFrame) -> pd.DataFrame:
    """Expand typical-period data back to the original time series length.

    Each original period is replaced by its assigned cluster representative
    from ``data``. The result uses the original datetime index.

    Parameters
    ----------
    data : pd.DataFrame
        Typical-period data matching ``cluster_representatives``:

        - ``(cluster, timestep)`` MultiIndex for non-segmented, or
        - ``(cluster, segment, duration)`` MultiIndex for segmented.

    Returns
    -------
    pd.DataFrame
        Disaggregated data with the original datetime index.
        For segmented input, non-segment-start timesteps are NaN.

    Examples
    --------
    >>> result = tsam.aggregate(df, n_clusters=8)
    >>> optimized = run_optimization(result.cluster_representatives)
    >>> full_year = result.disaggregate(optimized)
    """
    expanded = self.clustering.disaggregate(data)
    # Trim to original length (last period may be padded) and restore datetime index
    expanded = expanded.iloc[: len(self.original)]
    expanded.index = self.original.index
    return cast("pd.DataFrame", expanded)

to_dict ¶

to_dict() -> dict

Export results as a dictionary for serialization.

Returns:

Type	Description
`dict`	Dictionary containing all result data in serializable format.

Source code in src/tsam/result.py

def to_dict(self) -> dict:
    """Export results as a dictionary for serialization.

    Returns
    -------
    dict
        Dictionary containing all result data in serializable format.
    """
    return {
        "cluster_representatives": self.cluster_representatives.to_dict(),
        "cluster_assignments": self.cluster_assignments.tolist(),
        "cluster_weights": self.cluster_weights,
        "n_clusters": self.n_clusters,
        "n_timesteps_per_period": self.n_timesteps_per_period,
        "n_segments": self.n_segments,
        "segment_durations": self.segment_durations,
        "clustering": self.clustering.to_dict(),
        "accuracy": {
            "rmse": self.accuracy.rmse.to_dict(),
            "mae": self.accuracy.mae.to_dict(),
            "rmse_duration": self.accuracy.rmse_duration.to_dict(),
            "rescale_deviations": self.accuracy.rescale_deviations.to_dict(),
            "weighted_rmse": self.accuracy.weighted_rmse,
            "weighted_mae": self.accuracy.weighted_mae,
            "weighted_rmse_duration": self.accuracy.weighted_rmse_duration,
        },
        "clustering_duration": self.clustering_duration,
    }

tsam.result¶

tsam.result ¶

AccuracyMetrics dataclass ¶

summary property ¶

AggregationResult dataclass ¶

n_clusters cached property ¶

n_segments cached property ¶

cluster_assignments cached property ¶

original cached property ¶

reconstructed cached property ¶

residuals cached property ¶

timestep_index property ¶

period_index property ¶

assignments property ¶

plot property ¶

disaggregate ¶

to_dict ¶

AccuracyMetrics `dataclass` ¶

summary `property` ¶

AggregationResult `dataclass` ¶

n_clusters `cached` `property` ¶

n_segments `cached` `property` ¶

cluster_assignments `cached` `property` ¶

original `cached` `property` ¶

reconstructed `cached` `property` ¶

residuals `cached` `property` ¶

timestep_index `property` ¶

period_index `property` ¶

assignments `property` ¶

plot `property` ¶