Time Series Aggregation Class

The class and functions of this module are described below.

class timeseriesaggregation.TimeSeriesAggregation(timeSeries, resolution=None, noTypicalPeriods=10, noSegments=10, hoursPerPeriod=24, clusterMethod='hierarchical', evalSumPeriods=False, sortValues=False, sameMean=False, rescaleClusterPeriods=True, weightDict=None, segmentation=False, extremePeriodMethod='None', representationMethod=None, representationDict=None, distributionPeriodWise=True, segmentRepresentationMethod=None, predefClusterOrder=None, predefClusterCenterIndices=None, solver='highs', roundOutput=None, addPeakMin=None, addPeakMax=None, addMeanMin=None, addMeanMax=None)

Clusters time series data to typical periods.

__init__(timeSeries, resolution=None, noTypicalPeriods=10, noSegments=10, hoursPerPeriod=24, clusterMethod='hierarchical', evalSumPeriods=False, sortValues=False, sameMean=False, rescaleClusterPeriods=True, weightDict=None, segmentation=False, extremePeriodMethod='None', representationMethod=None, representationDict=None, distributionPeriodWise=True, segmentRepresentationMethod=None, predefClusterOrder=None, predefClusterCenterIndices=None, solver='highs', roundOutput=None, addPeakMin=None, addPeakMax=None, addMeanMin=None, addMeanMax=None)

Initializes the period-wise clustering.

Parameters:
  • timeSeries (pandas.DataFrame() or dict) – DataFrame with the datetime as index and the relevant time series parameters as columns. required

  • resolution (float) – Resolution of the time series in hours [h]. If timeSeries is a pandas.DataFrame() the resolution is derived from the datetime index. optional, default: delta_T in timeSeries

  • hoursPerPeriod (integer) – Value which defines the length of a cluster period. optional, default: 24

  • noTypicalPeriods (integer) – Number of typical periods, equivalent to the number of clusters. optional, default: 10

  • noSegments (integer) – Number of segments into which the typical periods should be subdivided, equivalent to the number of inner-period clusters. optional, default: 10

  • clusterMethod (string) –

    Chosen clustering method. optional, default: ‘hierarchical’
    Options are:

    • ‘averaging’

    • ‘k_means’

    • ‘k_medoids’

    • ‘k_maxoids’

    • ‘hierarchical’

    • ‘adjacent_periods’

  • evalSumPeriods (boolean) – Whether the averaged value of each period is included in the clustering as an additional feature alongside the period profiles. optional, default: False

  • sameMean (boolean) – Boolean which is used in the normalization procedure. If true, all time series get normalized such that they have the same mean value. optional, default: False

  • sortValues (boolean) – Whether the clustering is based on the duration curve of each period (True) or on the original shape of the data (False). optional (default: False)

  • rescaleClusterPeriods (boolean) – Whether the cluster periods are rescaled such that their weighted mean value matches the mean value of the original time series. optional (default: True)

  • weightDict (dict) – Dictionary which weights the profiles. The weighting is applied by scaling the time series during normalization. Normally all time series are scaled from 0 to 1. Scaling them differently changes the distances between their values, so they are weighted differently in the clustering process. optional (default: None)

  • extremePeriodMethod (string) –

    Method used to integrate extreme periods (peak demand, lowest temperature etc.) into the typical period profiles. optional, default: ‘None’
    Options are:

    • ‘None’: no integration at all.

    • ‘append’: append the extreme periods to the cluster centers as additional typical periods.

    • ‘new_cluster_center’: add the extreme period as an additional cluster center. All periods are then checked for whether they fit better to this new center or to their original cluster center.

    • ‘replace_cluster_center’: replace the center of the cluster to which the extreme period belongs with the profile of the extreme period. (Worst case system design)

  • representationMethod (string) –

    Chosen representation. If specified, the clusters are represented in the chosen way. Otherwise, each clusterMethod has its own commonly used default representation method.
    Options are:

    • ‘meanRepresentation’ (default of ‘averaging’ and ‘k_means’)

    • ‘medoidRepresentation’ (default of ‘k_medoids’, ‘hierarchical’ and ‘adjacent_periods’)

    • ‘minmaxmeanRepresentation’

    • ‘durationRepresentation’ / ‘distributionRepresentation’

    • ‘distributionAndMinMaxRepresentation’

  • representationDict (dict) – Dictionary which states for each attribute whether the profiles in each cluster should be represented by the minimum value or maximum value of each time step. This allows conservative estimates. The dictionary is needed when ‘minmaxmeanRepresentation’ is chosen. If not specified, it defaults to ‘mean’ for all attributes.

  • distributionPeriodWise (boolean) – If ‘durationRepresentation’ is chosen, determines whether the distribution of each cluster is preserved separately (True) or only that of the original time series (False). optional (default: True)

  • segmentRepresentationMethod (string) –

    Chosen representation for the segments. If specified, the segments are represented in the chosen way. Otherwise, it is inherited from the representationMethod.
    Options are:

    • ‘meanRepresentation’ (default of ‘averaging’ and ‘k_means’)

    • ‘medoidRepresentation’ (default of ‘k_medoids’, ‘hierarchical’ and ‘adjacent_periods’)

    • ‘minmaxmeanRepresentation’

    • ‘durationRepresentation’ / ‘distributionRepresentation’

    • ‘distributionAndMinMaxRepresentation’

  • predefClusterOrder (list or array) – Predefined grouping of the periods, which is used instead of aggregating the time series. optional (default: None)

  • predefClusterCenterIndices (list or array) – If predefClusterOrder is given, this list can define the representative cluster candidates. Otherwise the medoid is taken. optional (default: None)

  • solver (string) – Solver that is used for k_medoids clustering. optional (default: ‘highs’)

  • roundOutput (integer) – Number of decimals to which the output time series are rounded. optional (default: None)

  • addPeakMin (list) – List of column names whose minimal value shall be added to the typical periods. E.g. [‘Temperature’]. optional, default: []

  • addPeakMax (list) – List of column names whose maximal value shall be added to the typical periods. E.g. [‘EDemand’, ‘HDemand’]. optional, default: []

  • addMeanMin (list) – List of column names where the period with the cumulative minimal value shall be added to the typical periods. E.g. [‘Photovoltaic’]. optional, default: []

  • addMeanMax (list) – List of column names where the period with the cumulative maximal value shall be added to the typical periods. optional, default: []
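
The effect of weightDict described in the parameter list can be pictured with a small standalone sketch. It is not the class's own implementation: the helper normalize_and_weight and the sample data are hypothetical, and min-max scaling to the range 0 to 1 is assumed from the weightDict description above.

```python
def normalize_and_weight(columns, weights=None):
    """Min-max normalize each column to [0, 1], then scale it by its weight.

    `columns` maps a column name to a list of floats. `weights` plays the
    role of weightDict; columns without an entry keep a weight of 1.
    (Hypothetical helper, not part of the documented class.)
    """
    weights = weights or {}
    scaled = {}
    for name, values in columns.items():
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        weight = weights.get(name, 1.0)
        scaled[name] = [weight * (v - lo) / span for v in values]
    return scaled


data = {"EDemand": [10.0, 20.0, 30.0], "Temperature": [-5.0, 0.0, 5.0]}
scaled = normalize_and_weight(data, weights={"EDemand": 2.0})
print(scaled["EDemand"])      # [0.0, 1.0, 2.0]
print(scaled["Temperature"])  # [0.0, 0.5, 1.0]
```

With a weight of 2, differences in EDemand contribute twice as much to a distance between two time steps as differences in Temperature, which is why weighted columns have more influence on the clustering.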

createTypicalPeriods()

Clusters the periods.

Returns:

self.typicalPeriods – All typical Periods in scaled form.

prepareEnersysInput()

Creates all dictionaries and lists which are required for the energy system optimization input.

property stepIdx

Index of the time steps inside a single cluster.

property clusterPeriodIdx

Index of the clustered periods.

property clusterOrder

The sequence of typical periods that represents the original time series.

property clusterPeriodNoOccur

How often each typical period occurs in the original time series.
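
The relation between clusterOrder and clusterPeriodNoOccur can be illustrated with a short sketch. The cluster order below is made-up sample data, and counting with collections.Counter is only one way to obtain such occurrence numbers, not necessarily how the class computes them.

```python
from collections import Counter

# Hypothetical cluster order: seven original periods, each assigned to one
# of three typical periods (0, 1 or 2).
cluster_order = [0, 1, 1, 0, 2, 1, 0]

# Count how many original periods each typical period represents.
occurrences = dict(Counter(cluster_order))
print(occurrences)  # {0: 3, 1: 3, 2: 1}

# The counts always add up to the number of original periods.
print(sum(occurrences.values()) == len(cluster_order))  # True
```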

property clusterPeriodDict

Time series data for each period index as dictionary.

property segmentDurationDict

Segment duration in time steps for each period index as dictionary.

predictOriginalData()

Predicts the full time series that results when every period is replaced by its cluster center.

Returns:

predictedData (pandas.DataFrame) – DataFrame which has the same shape as the original one.
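
The idea of replacing every period by its cluster center can be sketched in plain Python. The function predict_from_clusters and the sample profiles are hypothetical and work on lists for a single column; the documented method returns a pandas.DataFrame instead.

```python
def predict_from_clusters(cluster_order, typical_periods):
    """Rebuild a full-length series from typical period profiles.

    `cluster_order` lists, for each original period, the id of the typical
    period that represents it. `typical_periods` maps that id to a profile.
    (Hypothetical helper for illustration only.)
    """
    predicted = []
    for cluster_id in cluster_order:
        predicted.extend(typical_periods[cluster_id])
    return predicted


# Two typical periods with three time steps each, four original periods.
typical_periods = {0: [1.0, 2.0, 3.0], 1: [4.0, 5.0, 6.0]}
cluster_order = [0, 1, 1, 0]

predicted = predict_from_clusters(cluster_order, typical_periods)
print(len(predicted))  # 12, the length of the original series
print(predicted[:6])   # [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
```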

indexMatching()

Relates the index of the original time series to the indices represented by the clusters.

Returns:

timeStepMatching (pandas.DataFrame) – DataFrame which has the same shape as the original one.

accuracyIndicators()

Compares the predicted data with the original time series.

Returns:

pd.DataFrame(indicatorRaw) (pandas.DataFrame) – DataFrame containing indicators that evaluate the accuracy of the aggregation.
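
A comparison of predicted and original data can be sketched as follows. This page does not state which indicators are reported, so the root-mean-square error and the mean absolute error used here are assumptions chosen as common examples; the helper names and sample values are hypothetical.

```python
import math


def rmse(original, predicted):
    """Root-mean-square error between two equally long sequences."""
    squared = [(o - p) ** 2 for o, p in zip(original, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(original, predicted):
    """Mean absolute error between two equally long sequences."""
    absolute = [abs(o - p) for o, p in zip(original, predicted)]
    return sum(absolute) / len(absolute)


original = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 2.0, 6.0]

print(mae(original, predicted))   # 0.75
print(rmse(original, predicted))  # square root of 1.25
```

The RMSE penalizes the single large deviation more strongly than the MAE, so reporting both gives a rough picture of whether the error is spread evenly or concentrated in a few time steps.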

totalAccuracyIndicators()

Derives the accuracy indicators over all time series.

timeseriesaggregation.unstackToPeriods(timeSeries, timeStepsPerPeriod)

Extends the time series to an integer multiple of the period length and groups it into periods.

Parameters:
  • timeSeries (pandas DataFrame) –

  • timeStepsPerPeriod (integer) – The number of discrete timesteps which describe one period. required

Returns:

  • unstackedTimeSeries (pandas DataFrame) – is stacked such that each row represents a candidate period

  • timeIndex (pandas Series index) – the modified index of the original time series, in case the series was extended to an integer multiple of the period length
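
The extend-and-group step can be pictured with a list-based sketch for a single column. The function unstack_to_periods below is a hypothetical stand-in, not the documented function. In particular, the source does not say how the series is extended, so filling the last period with values from the start of the series is an assumption.

```python
def unstack_to_periods(values, steps_per_period):
    """Pad `values` to a multiple of `steps_per_period` and split it into rows.

    Each returned row is one candidate period. Assumes the series is at
    least one period long. (Hypothetical helper for illustration only.)
    """
    remainder = len(values) % steps_per_period
    if remainder:
        # Assumption: complete the last period with values from the start.
        values = values + values[: steps_per_period - remainder]
    return [
        values[start:start + steps_per_period]
        for start in range(0, len(values), steps_per_period)
    ]


periods = unstack_to_periods([1, 2, 3, 4, 5, 6, 7], 3)
print(periods)  # [[1, 2, 3], [4, 5, 6], [7, 1, 2]]
```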