Correlation Clustering Selector

The correlation clustering selector aims to produce a set of weakly correlated assets. The general idea is to partition the original universe of assets into clusters of highly correlated elements. Then, for each cluster of two or more assets, a single representative is selected according to a performance measure. The size of the final selection therefore equals the number of clusters.
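The selection step described above can be sketched as follows, assuming the cluster labels and a per-symbol performance score are already available. The helper name `pick_representatives`, the symbols, and the numbers are hypothetical and are not part of azapy.

```python
import pandas as pd


def pick_representatives(labels: pd.Series, score: pd.Series) -> list:
    """Return one symbol per cluster: the member with the highest score."""
    # Align cluster labels and scores on the symbol index.
    df = pd.DataFrame({"cluster": labels, "score": score}).dropna()
    # idxmax gives the index label (the symbol) of the best score per cluster;
    # a single-symbol cluster simply returns its only member.
    return sorted(df.groupby("cluster")["score"].idxmax())


# Toy, made-up inputs: five symbols grouped into three clusters.
labels = pd.Series({"AAA": 1, "BBB": 1, "CCC": 2, "DDD": 3, "EEE": 3})
score = pd.Series({"AAA": 0.10, "BBB": 0.25, "CCC": -0.05,
                   "DDD": 0.40, "EEE": 0.15})

selected = pick_representatives(labels, score)
print(selected)  # ['BBB', 'CCC', 'DDD']
```

The selection has exactly one symbol per cluster, so its size matches the number of clusters.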

CorrClusterSelector uses a hierarchical clustering algorithm with Ward linkage and the correlation distance between assets, \(d_\rho(A, B) = 1 - \rho(A, B)\). The hierarchical tree is cut at \(1 - \rho_{\rm th}\), where \(\rho_{\rm th}\) is a user-defined correlation threshold; a typical value is \(\rho_{\rm th}=0.95\). The best representative of each cluster is chosen with the f13612w filter, a momentum measure defined as the weighted average of the most recent annualized 1-, 3-, 6-, and 12-month rates of return. The typical setup is an equally weighted average. However, the azapy implementation allows arbitrary non-negative weights (not all zero), e.g., [1, 2, 1, 1].
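A minimal sketch of the clustering step, assuming SciPy's hierarchical clustering routines stand in for whatever azapy uses internally. The helper name `corr_clusters` and the synthetic rates are hypothetical. SciPy documents Ward linkage as correctly defined only for Euclidean distances, so applying it to a correlation distance should be read as a heuristic.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def corr_clusters(rates: pd.DataFrame, corr_threshold: float = 0.95) -> pd.Series:
    """Label each column of `rates` with a cluster id (hypothetical helper)."""
    corr = rates.corr()
    # Correlation distance d = 1 - rho, made exactly symmetric and non-negative.
    dist = 1.0 - corr.to_numpy()
    dist = np.clip((dist + dist.T) / 2.0, 0.0, None)
    np.fill_diagonal(dist, 0.0)
    # Ward linkage on the condensed (upper-triangle) distance vector.
    tree = linkage(squareform(dist, checks=False), method="ward")
    # Cut the tree at height 1 - corr_threshold.
    labels = fcluster(tree, t=1.0 - corr_threshold, criterion="distance")
    return pd.Series(labels, index=corr.columns, name="cluster")


# Synthetic rates: AAA and BBB are almost identical, CCC is independent.
rng = np.random.default_rng(0)
base = rng.normal(size=250)
rates = pd.DataFrame({
    "AAA": base,
    "BBB": base + 0.01 * rng.normal(size=250),
    "CCC": rng.normal(size=250),
})

clusters = corr_clusters(rates)
print(clusters)
```

With these inputs AAA and BBB share a cluster label while CCC gets its own, because only the AAA-BBB pair merges below the cut at 0.05.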


CorrClusterSelector class

class azapy.Selectors.CorrClusterSelector.CorrClusterSelector(pname='CorrCluster', corr_threshold=0.95, freq='Q', ftype='f13612w', fw=None, col_price='adjusted', hlength=1)

Bases: NullSelector

Selects symbols with lower inter-correlation.

Attributes

pname : str - portfolio name
mkt : pandas.DataFrame - selection’s market data
symb : list - selected symbols
symb_omitted : list - unselected symbols
capital : float - always set to 1

Methods

getSelection(mktdata, **params)

Computes the selection.

__init__(pname='CorrCluster', corr_threshold=0.95, freq='Q', ftype='f13612w', fw=None, col_price='adjusted', hlength=1)

Constructor

Parameters:

pname : str, optional: Selector name. The default is ‘CorrCluster’.
corr_threshold : float, optional: Cluster correlation threshold (i.e., a cluster contains only symbols with inter-correlation higher than corr_threshold). The default is 0.95.
freq : str, optional: The horizon of the rates of return used for correlation estimation. It can be either ‘M’ for monthly or ‘Q’ for quarterly rates. The default is ‘Q’.
ftype : str, optional: Inner-cluster filter (i.e., the criterion used to designate the representative of a cluster with more than one symbol). At this point only ‘f13612w’ is implemented. The default is ‘f13612w’.
fw : list, optional: List of filter weights. For ‘f13612w’ it must be a list of 4 non-negative numbers, not all zero. A value of None indicates equal weights. Note: the weights are normalized internally. The default is None.
col_price : str, optional: The name of the pricing column to be considered in computations. The default is ‘adjusted’.
hlength : float, optional: History length in number of years used for calibration. A fractional number will be rounded to an integer number of months. The default is 1 year.

Returns:

The object.
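As a rough illustration of how ftype=‘f13612w’ and fw interact, the sketch below computes the momentum score from a series of month-end prices. It is not the azapy implementation: the helper name `f13612w_score`, the simple 12/m annualization, and the use of month-end prices are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def f13612w_score(monthly_prices: pd.Series, fw=None) -> float:
    """Weighted average of annualized 1-, 3-, 6-, 12-month rates (sketch)."""
    months = np.array([1, 3, 6, 12])
    # None means equal weights; otherwise 4 non-negative numbers, not all zero.
    w = np.ones(4) if fw is None else np.asarray(fw, dtype=float)
    if w.shape != (4,) or np.any(w < 0) or w.sum() <= 0:
        raise ValueError("fw must hold 4 non-negative numbers, not all zero")
    w = w / w.sum()  # normalize so the weights sum to 1
    p = monthly_prices.to_numpy(dtype=float)
    if len(p) < 13:
        raise ValueError("need at least 13 month-end prices")
    # Most recent m-month rates of return.
    rates = np.array([p[-1] / p[-1 - m] - 1.0 for m in months])
    # Assumption: simple annualization, i.e. scale an m-month rate by 12/m.
    annualized = rates * 12.0 / months
    return float(w @ annualized)


# Toy price path growing 1% per month over 12 months (13 observations).
prices = pd.Series(100.0 * 1.01 ** np.arange(13))

equal_weighted = f13612w_score(prices)
tilted = f13612w_score(prices, fw=[1, 2, 1, 1])
print(equal_weighted, tilted)
```

Because the weights are normalized, fw=[1, 2, 1, 1] and fw=[2, 4, 2, 2] yield the same score in this sketch.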

getSelection(mktdata, **params)

Computes the selection.

Parameters:

mktdata : pandas.DataFrame

MkT data in the format produced by the azapy function readMkT.

**params : dict, optional

Other optional parameters:

verbose : Boolean, optional: When set to True, the selected symbols are printed. The default is False.
view : Boolean, optional: When set to True, the dendrogram of the hierarchical clustering is displayed. The default is False. Note: the tree cutoff is at the 1 - corr_threshold level.

Returns:

(capital, mkt) : tuple

capital : float: Fraction of capital allocated to the selection. For this selector it is always 1.
mkt : pandas.DataFrame: Selection MkT data in the format produced by the azapy function readMkT.


Example CorrClusterSelector

# Examples
import numpy as np

import azapy as az
print(f"azapy version {az.version()}", flush=True)

#==============================================================================
# collect market data
mktdir = '../../MkTdata'
sdate = '2012-01-01'
edate = '2021-07-27'

symb = ['GLD', 'TLT', 'IHI', 'SPY', 'OIH',
        'XAR', 'XBI', 'XHE', 'XHS', 'XLB',
        'XLE', 'XLF', 'XLI', 'XLK', 'XLU',
        'XLV', 'XLY', 'XRT', 'ONEQ', 'QQQ',
        'DIA', 'ILF', 'XSW', 'PGF', 'IDV',
        'JNK', 'HYG', 'SDIV', 'VIG', 'SLV',
        'AAPL', 'MSFT', 'AMZN', 'GOOG', 'IYT',
        'IWM', 'BRK-B', 'ITA']

mktdata = az.readMkT(symb, sdate=sdate, edate=edate, file_dir=mktdir, 
                     verbose=False)

#==============================================================================
# CorrClusterSelector

selector = az.CorrClusterSelector()

capital, mkt = selector.getSelection(mktdata)

print(f"As of {edate}\n"
      f"capital at risk: {capital}\n"
      f"selected symbols: {mkt.symbol.unique()}\n"
      f"selected {len(mkt.symbol.unique())} out of {len(symb)} symbols\n"
      f"symbols omitted: {list(np.setdiff1d(symb, mkt.symbol.unique()))}")
