azapy.MkT package

Submodules

azapy.MkT.MkTcalendar module

azapy.MkT.MkTcalendar.NYSEgen(sdate='1980-01-01', edate='2050-12-31')

Returns the NYSE business calendar between sdate and edate.

To be deprecated in future versions. Please use calendarGen(name="NYSE", sdate=sdate, edate=edate) instead.

Parameters:
sdate : date like, optional

Calendar start date. The default is ‘1980-01-01’.

edate : date like, optional

Calendar end date. The default is ‘2050-12-31’.

Returns:
`numpy.busdaycalendar` : NYSE business calendar.
azapy.MkT.MkTcalendar.calendarGen(name='NYSE', sdate='1980-01-01', edate='2050-12-31')

Returns the business calendar of an exchange.

Parameters:
name : str, optional

The exchange name. Valid exchange names are listed by get_calendar_names(). The default is ‘NYSE’, i.e., the New York Stock Exchange.

sdate : date like, optional

Calendar start date. The default is ‘1980-01-01’.

edate : date like, optional

Calendar end date. The default is ‘2050-12-31’.

Returns:
`numpy.busdaycalendar` : The exchange business calendar.
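
A numpy.busdaycalendar object plugs directly into NumPy's business-day functions. The sketch below is only an illustration: it builds a small calendar by hand with two January 2024 holidays (an assumption made for the example, not the holiday list used by azapy) and shows the kind of queries a calendar returned by calendarGen could serve.

```python
import numpy as np

# Hand-made stand-in for an exchange calendar (assumption for illustration):
# Monday-Friday weeks with two holidays in January 2024.
holidays = ["2024-01-01", "2024-01-15"]
cal = np.busdaycalendar(weekmask="1111100", holidays=holidays)

# 2024-01-15 falls on a Monday but is listed as a holiday.
is_open = bool(np.is_busday(np.datetime64("2024-01-15"), busdaycal=cal))

# Business days in January 2024 (the end date is excluded).
n_days = int(
    np.busday_count(
        np.datetime64("2024-01-01"), np.datetime64("2024-02-01"), busdaycal=cal
    )
)

print(is_open, n_days)
```

With a calendar produced by calendarGen, the same two NumPy calls would apply unchanged; only the construction of `cal` differs.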
azapy.MkT.MkTcalendar.get_calendar_names()

Returns the names of the available exchange calendars.

azapy.MkT.MkTreader module

class azapy.MkT.MkTreader.MkTreader(verbose=True)

Bases: object

Collects historical market prices from market data providers such as ‘yahoo’, ‘eodhistoricaldata’, ‘alphavantage’ and ‘marketstack’.

Attributes
  • dsource : dict of request instructions per symbol

  • delta_time : execution time of the request in seconds

  • rout : pandas.DataFrame containing historical prices for all symbols. It is created during the call of get function.

  • rout_status : request status information. It is created during the call of get_request_status function or during the call of function get with option verbose=True.

  • error_log : contains lists of missing historical observation dates. It is created together with rout_status.

Methods

get([symbol, sdate, edate, calendar, ...])

Retrieves market data for a set of stock symbols.

get_error_log()

Returns lists of missing historical observation dates per symbol.

get_request_status([verbose])

Reports abbreviated information about request status.

set_imputation([method])

Historical market data imputation, i.e., filling missing values according to the imputation method.

__init__(verbose=True)

Constructor

Parameters:
verbose : Boolean, optional

If set to True, additional information will be printed during the loading of historical prices. The default value is True.

Returns:
The MkTreader object
get(symbol=[], sdate='2012-01-01', edate='today', calendar=None, output_format='frame', source=None, force=False, save=True, file_dir='outDir', file_format='csv', api_key=None, param=None, verbose=None)

Retrieves market data for a set of stock symbols.

Parameters:
symbol : str or list of str, optional

Stock symbols to be loaded. The default is [].

sdate : date like, optional

The start date of historical time series. The default is “2012-01-01”.

edate : date like, optional

The end date of historical time series (must satisfy sdate <= edate). The default is ‘today’.

calendar : str or numpy.busdaycalendar, optional

Business calendar. It can be the exchange calendar name as a str or a numpy.busdaycalendar object. If it is None then it will be set to NYSE business calendar. The default value is None.

output_format : str, optional
The function output format. It can be:
  • ‘frame’ - pandas.DataFrame

  • ‘dict’ - dict of pandas.DataFrame. The symbols are the keys.

The default is ‘frame’.

source : str or dict, optional

If it is a str, then it represents the market data provider for all historical price requests. Possible values are: ‘yahoo’, ‘alphavantage’, ‘alphavantage_yahoo’, ‘eodhistoricaldata’, ‘eodhistoricaldata_yahoo’ and ‘marketstack’. If set to None, it will default to ‘yahoo’.

Alternatively, it can be set to a dict containing specific instructions for each stock symbol. The dict keys are the symbols and the values are dicts of instructions specific to each symbol. Valid keys for an instruction dict are the names of the call variables of this function, except ‘sdate’, ‘edate’, ‘calendar’ and ‘output_format’. The actual set of stock symbols is given by the union of the variable ‘symbol’ and the keys of the dict ‘source’. Missing values in a symbol instruction dict will be filled with the values of the function call variables, which act as generic values to be used in the absence of specific instructions in the ‘source’ dict. The default is None.

Example of dict ‘source’:

source = {'AAPL': {'source': 'eodhistoricaldata', 'verbose': True}, 'SPY': {'source': 'yahoo', 'force': True}}

In this case there are 2 symbols that will be added (union) to the set of symbols defined by ‘symbol’ variable. For symbol ‘AAPL’ the provider source is eodhistoricaldata and the ‘verbose’ instruction is set to True. The rest of the instructions: ‘force’, ‘save’, ‘file_dir’, ‘file_format’, ‘api_key’ and ‘param’ are set to the values of the corresponding function call variables. Similar for symbol ‘SPY’. The instructions for the rest of the symbols that may be specified in the ‘symbol’ variable will be set according to the values of the function call variables.
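
The merging rule described above can be sketched in a few lines of plain Python. This is not the azapy implementation: `resolve_instructions` and the `generic` dict are hypothetical names introduced only to illustrate how per-symbol instructions could override the generic call values, and how the final symbol set is the union of ‘symbol’ and the keys of ‘source’.

```python
def resolve_instructions(symbol, source, generic):
    """Hypothetical helper: build the effective instructions per symbol.

    symbol  : list of symbols passed through the 'symbol' variable
    source  : dict of per-symbol instruction dicts
    generic : dict of generic values taken from the function call variables
    """
    # Union of 'symbol' and the keys of 'source', keeping first-seen order.
    symbols = list(symbol)
    for key in source:
        if key not in symbols:
            symbols.append(key)

    resolved = {}
    for sym in symbols:
        instructions = dict(generic)              # start from the generic values
        instructions.update(source.get(sym, {}))  # symbol-specific overrides
        resolved[sym] = instructions
    return resolved


generic = {"source": "yahoo", "force": False, "save": True, "verbose": False}
source = {
    "AAPL": {"source": "eodhistoricaldata", "verbose": True},
    "SPY": {"source": "yahoo", "force": True},
}
plan = resolve_instructions(["SPY", "MSFT"], source, generic)

for sym, instructions in plan.items():
    print(sym, instructions)
```

Here ‘MSFT’ receives only the generic values, ‘SPY’ overrides ‘force’, and ‘AAPL’ is added to the request even though it does not appear in the ‘symbol’ list.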

force : Boolean, optional
  • True: will try to collect historical prices exclusively from the market data providers.

  • False: first it will try to load the historical prices from a locally saved file. If such a file does not exist, the market data provider will be accessed.

If the file exists but the saved historical data is too short, then it will try to collect only the missing values from the market data provider. The default is False.

save : Boolean, optional
  • True: It will try to save the historical prices collected from the providers to a local file.

  • False: No attempt to save the data is made.

The default is True.

file_dir : str, optional

Directory with (to save) historical market data. If it does not exist, then it will be created. The default is “outDir”.

file_format : str, optional

The saved file format for the historical prices. The following file formats are supported: csv, json and feather. The default is ‘csv’.

api_key : str, optional

Provider API key (where required). If set to None, then the API key is read from the following global environment variables:

  • ALPHAVANTAGE_API_KEY for alphavantage,

  • EODHISTORICALDATA_API_KEY for eodhistoricaldata,

  • MARKETSTACK_API_KEY for marketstack.

The default is None.
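
As a minimal sketch of the fallback described above, the snippet below sets one of the listed environment variables and resolves a key from it. The helper `resolve_api_key` is hypothetical (it is not part of azapy), and "demo-key" is a placeholder rather than a working key.

```python
import os

# Placeholder key for illustration only; a real key comes from the provider.
os.environ.setdefault("MARKETSTACK_API_KEY", "demo-key")


def resolve_api_key(api_key, env_name):
    """Hypothetical helper: prefer an explicit key, else read the environment."""
    return api_key if api_key is not None else os.getenv(env_name)


key_from_env = resolve_api_key(None, "MARKETSTACK_API_KEY")
key_explicit = resolve_api_key("explicit-key", "MARKETSTACK_API_KEY")
```

In practice the variable would typically be exported in the shell or the user profile, so that keys do not appear in source code.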

param : dict, optional

Set of additional information to access the market data provider. At this point in time, only the alphavantage provider requires an additional parameter, specifying the maximum number of API requests (symbols) per minute. It varies with the level of access corresponding to the API key. The minimum value is 5 for a free key and starts at 75 for premium keys. This value is stored in the max_req_per_min variable.

Example: param = {'max_req_per_min': 5}

This is also the default value for alphavantage if param is set to None. The default is None.

verbose : Boolean, optional

If set to True, additional information will be printed during the loading of historical prices. If None, it is ignored; otherwise it overrides the value set by the constructor. The default value is None.

Returns:
`pandas.DataFrame` or dict of `pandas.DataFrame` : Historical market data.

The output format is designated by the value of the input parameter output_format.

get_error_log()

Returns lists of missing historical observation dates per symbol.

Returns:
`dict` : The error log.
If it is an empty dict, then there are no missing dates in the collected historical time series. Otherwise, the keys of the dict are the symbols that have missing dates. The values for these keys are also dicts, with the following fields:
  • ‘back’ : a list of missing dates at the tail of the time series

  • ‘front’ : a list of missing dates at the head of the time series

  • ‘mid’ : a list of missing dates in the middle of the time series

Fields with an empty list of dates are omitted.
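
To make the layout of the error log concrete, the sketch below classifies the missing business days of a single symbol into ‘front’, ‘mid’ and ‘back’. It is an illustration under stated assumptions, not the azapy code: `missing_dates_log` is a hypothetical helper, the end date is treated as exclusive, and the default calendar is a plain Monday-Friday week without holidays.

```python
import numpy as np


def missing_dates_log(observed, sdate, edate, cal=None):
    """Hypothetical helper: classify missing business days for one symbol.

    `edate` is exclusive. Returns a dict with optional keys 'front', 'mid'
    and 'back'; keys with no dates are omitted.
    """
    if cal is None:
        cal = np.busdaycalendar()  # default Monday-Friday week, no holidays
    days = np.arange(sdate, edate, dtype="datetime64[D]")
    expected = days[np.is_busday(days, busdaycal=cal)]
    observed = np.asarray(observed, dtype="datetime64[D]")
    missing = expected[~np.isin(expected, observed)]

    if observed.size == 0:
        # Nothing observed: this sketch reports every expected date as 'mid'.
        groups = {"mid": missing}
    else:
        first, last = observed.min(), observed.max()
        groups = {
            "front": missing[missing < first],
            "mid": missing[(missing > first) & (missing < last)],
            "back": missing[missing > last],
        }
    # Drop empty fields, as described for the error log.
    return {k: [str(d) for d in v] for k, v in groups.items() if v.size > 0}


# Business week 2024-01-08 .. 2024-01-12 with Monday and Friday missing.
log = missing_dates_log(
    ["2024-01-09", "2024-01-10", "2024-01-11"], "2024-01-08", "2024-01-13"
)
print(log)
```

Because no date is missing between the first and last observations in this example, the ‘mid’ field is omitted from the result.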
get_request_status(verbose=None)

Reports abbreviated information about request status.

verbose : Boolean, optional

If set to True, additional information will be printed during the function execution. If set to None, it will be ignored; otherwise it will override the value set by the constructor. The default value is None.

Returns:
`pandas.DataFrame` : The status report.
The column names are the symbols for which the data was requested. The rows contain the actual input parameters per symbol as well as:
  • ‘nrow’ : the length of the historical time series.

  • ‘sdate’ : first date in the time series.

  • ‘edate’ : end date of the time series.

  • ‘error’ : whether there are missing data. If its value is ‘Yes’, then the actual list of missing dates per symbol can be obtained by calling get_error_log.

set_imputation(method='linear')

Historical market data imputation, i.e., filling missing values according to the imputation method. The missing data at the beginning or the end of the time series remains unchanged.

Please use with caution, only in cases where a small amount of data needs to be filled in. Any change to the market data will introduce a bias. If a large amount of data is missing, it is advisable to use a different source of historical market data.

The function will return the new corrected market data without altering the object state (the raw market data is preserved).

Returns:
`pandas.DataFrame` or dict of `pandas.DataFrame` : The imputed market data, in the format selected by the input parameter output_format.
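
The behavior described above (linear filling of interior gaps, with the ends left untouched and the raw data preserved) can be reproduced on a toy series with pandas alone. This is a sketch of the idea, not the internals of set_imputation; the use of `limit_area="inside"` is an assumption about how such a rule can be expressed in pandas.

```python
import numpy as np
import pandas as pd

# Toy price series with missing values at the head, in the middle and at the tail.
raw = pd.Series(
    [np.nan, 10.0, np.nan, 12.0, np.nan],
    index=pd.bdate_range("2024-01-08", periods=5),
    name="close",
)

# limit_area="inside" fills only NaNs surrounded by valid observations,
# so the leading and trailing NaNs remain unchanged.
filled = raw.interpolate(method="linear", limit_area="inside")

print(filled)
```

`interpolate` returns a new series, so `raw` still holds the original values, mirroring the statement that the object state is not altered.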

azapy.MkT.readMkT module

azapy.MkT.readMkT.readMkT(symbol=[], sdate='2012-01-01', edate='today', calendar=None, output_format='frame', source=None, force=False, save=True, file_dir='outDir', file_format='csv', api_key=None, param=None, imputation=None, verbose=True)

Retrieves market data for a set of stock symbols.

It is a wrapper for MkTreader class returning directly the requested historical time series. The function call variables are the same as for ‘MkTreader’ member function ‘get’.

Parameters:
symbol : str or list of str, optional

Stock symbols to be loaded. The default is [].

sdate : date like, optional

The start date of historical time series. The default is “2012-01-01”.

edate : date like, optional

The end date of historical time series (must satisfy sdate <= edate). The default is ‘today’.

calendar : str or numpy.busdaycalendar, optional

Business calendar. It can be the exchange calendar name as a str or a numpy.busdaycalendar object. If it is None then it will be set to NYSE business calendar. The default value is None.

output_format : str, optional
The function output format. It can be:
  • ‘frame’ - pandas.DataFrame

  • ‘dict’ - dict of pandas.DataFrame. The symbols are the keys.

The default is ‘frame’.

source : str or dict, optional

If it is a str, then it represents the market data provider for all historical price requests. Possible values are: ‘yahoo’, ‘alphavantage’, ‘alphavantage_yahoo’, ‘eodhistoricaldata’, ‘eodhistoricaldata_yahoo’ and ‘marketstack’. If set to None, it will default to ‘yahoo’.

Alternatively, it can be set to a dict containing specific instructions for each stock symbol. The dict keys are the symbols and the values are dicts of instructions specific to each symbol. Valid keys for an instruction dict are the names of the call variables of this function, except ‘sdate’, ‘edate’, ‘calendar’ and ‘output_format’. The actual set of stock symbols is given by the union of the variable ‘symbol’ and the keys of the dict ‘source’. Missing values in a symbol instruction dict will be filled with the values of the function call variables, which act as generic values to be used in the absence of specific instructions in the ‘source’ dict. The default is None.

Example of dict ‘source’:

source = {'AAPL': {'source': 'eodhistoricaldata', 'verbose': True}, 'SPY': {'source': 'yahoo', 'force': True}}

In this case there are 2 symbols that will be added (union) to the set of symbols defined by ‘symbol’ variable. For symbol ‘AAPL’ the provider source is eodhistoricaldata and the ‘verbose’ instruction is set to True. The rest of the instructions: ‘force’, ‘save’, ‘file_dir’, ‘file_format’, ‘api_key’ and ‘param’ are set to the values of the corresponding function call variables. Similar for symbol ‘SPY’. The instructions for the rest of the symbols that may be specified in the ‘symbol’ variable will be set according to the values of the function call variables.

force : Boolean, optional
  • True: will try to collect historical prices exclusively from the market data providers.

  • False: first it will try to load the historical prices from a locally saved file. If such a file does not exist, the market data provider will be accessed.

If the file exists but the saved historical data is too short, then it will try to collect only the missing values from the market data provider. The default is False.

save : Boolean, optional
  • True: It will try to save the historical prices collected from the providers to a local file.

  • False: No attempt to save the data is made.

The default is True.

file_dir : str, optional

Directory with (to save) historical market data. If it does not exist, then it will be created. The default is “outDir”.

file_format : str, optional

The saved file format for the historical prices. The following file formats are supported: csv, json and feather. The default is ‘csv’.

api_key : str, optional

Provider API key (where required). If set to None, then the API key is read from the following global environment variables:

  • ALPHAVANTAGE_API_KEY for alphavantage,

  • EODHISTORICALDATA_API_KEY for eodhistoricaldata,

  • MARKETSTACK_API_KEY for marketstack.

The default is None.

param : dict, optional

Set of additional information to access the market data provider. At this point in time, only the alphavantage provider requires an additional parameter, specifying the maximum number of API requests (symbols) per minute. It varies with the level of access corresponding to the API key. The minimum value is 5 for a free key and starts at 75 for premium keys. This value is stored in the max_req_per_min variable.

Example: param = {'max_req_per_min': 5}

This is also the default value for alphavantage if param is set to None. The default is None.

imputation : str, optional
Method to fill missing data. Valid values are:
  • “linear” - filling with linearly interpolated values. The missing data at the ends of the time-series are not modified.

  • None - no imputation. However, missing data may halt further computations.

The default is None.

Note: It is recommended to first call the function without imputation and to analyze the quality of the raw data. Apply an imputation algorithm only if the amount of missing data is small and lies in non-critical areas of the time series. In general, any imputation methodology will introduce bias in the final evaluations.

verbose : Boolean, optional

If set to True, then additional information will be printed during the loading of historical prices, prior to imputation. The default is True. Note: The quality of the market data after an imputation may be assessed using the azapy function summary_MkTdata.

Returns:
`pandas.DataFrame` or dict of `pandas.DataFrame` : Historical market data.

The output format is designated by the value of the input parameter output_format.
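
A typical call might look like the sketch below. It uses only the parameters documented above and refers to the function as azapy.readMkT; the symbols, dates and the wrapper name `load_prices` are illustrative assumptions. The import is placed inside the function so that the snippet can be defined without azapy installed; actually calling it requires azapy and, unless the data is already saved locally, network access to the provider.

```python
def load_prices(symbols=("SPY", "AAPL")):
    """Illustrative wrapper: load historical market data via azapy.readMkT."""
    import azapy as az  # imported lazily; requires the azapy package

    return az.readMkT(
        list(symbols),
        sdate="2020-01-01",
        edate="2020-12-31",
        source="yahoo",         # a single provider for all symbols
        force=False,            # try the locally saved files first
        save=True,              # save newly collected data
        file_dir="outDir",
        file_format="csv",
        output_format="frame",  # a single pandas.DataFrame
        imputation=None,        # inspect the raw data before any imputation
        verbose=True,
    )
```

Leaving imputation at None follows the recommendation above to check the raw data quality first.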

azapy.MkT.update_MkTdata module

azapy.MkT.update_MkTdata.update_MkTdata(mktdir, source=None, api_key=None, param=None, except_file=[], verbose=True)

Updates all mkt data saved in a directory.

Parameters:
mktdir : str

Mkt data directory.

source : str, optional

Mkt data provider. For more details see the azapy.MkTreader.get function doc. The default is None.

api_key : str, optional

Mkt data provider API key. For more details see the azapy.MkTreader.get function doc. The default is None.

param : dict, optional

Additional parameters required by mkt data provider. For more details see the azapy.MkTreader.get function doc. The default is None.

except_file : list, optional

List of symbols to be omitted from the update. The default is [].

verbose : Boolean, optional
  • True: will print a progress report,

  • False: suppresses any printing to the terminal.

The default is True.

Returns:
int : Error code
  • 200 : successful, everything updated

  • 201 : some (or all) were not completely updated

  • 101 : the mktdir does not exist

  • 102 : unsupported mkt data sources
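
Since the function reports its outcome through an integer code, a caller may want to translate that code into a message. The mapping below simply restates the list above; `UPDATE_CODES`, `describe_update_code` and `update_complete` are hypothetical names introduced for this sketch and are not part of azapy.

```python
# Return codes as listed in the documentation of update_MkTdata.
UPDATE_CODES = {
    200: "successful, everything updated",
    201: "some (or all) files were not completely updated",
    101: "the mktdir does not exist",
    102: "unsupported market data source",
}


def describe_update_code(code):
    """Hypothetical helper: human-readable text for an update_MkTdata code."""
    return UPDATE_CODES.get(code, f"unknown code {code}")


def update_complete(code):
    """True only when every file was fully updated."""
    return code == 200


print(describe_update_code(201))
```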

Notes

Files with unsupported extensions (see azapy.MkTreader.get function) are silently omitted from the update.

azapy.MkT.summary_MkTdata module

azapy.MkT.summary_MkTdata.summary_MkTdata(mktdata, calendar=None, sdate=None, edate=None)

Summary of MkT data time-series length and quality (checks for missing records).

Parameters:
mktdata : pandas.DataFrame or a dict of pandas.DataFrame

Market data in the format returned by the azapy.readMkT function.

calendar : str or numpy.busdaycalendar, optional

Business calendar. It can be the exchange calendar name as a str or a numpy.busdaycalendar object. If it is None then it will be set to NYSE business calendar. The default value is None.

sdate : date like, optional

Time-series start date. If it is None then sdate will be set to the earliest date in mktdata. The default is None.

edate : date like, optional

Time-series end date. If it is None then edate will be set to the most recent date in mktdata. The default is None.

Returns:
`pandas.DataFrame` : A table with columns:
  • symbol : time-series symbol

  • begin : start date

  • end : end date

  • length : number of records

  • na_total : total number of NaN values

  • na_b : number of missing records at the beginning

  • na_e : number of missing records at the end

  • cont : total number of missing records
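
To illustrate how several of these columns relate to a time series, the sketch below computes begin, end, length, na_total, na_b and na_e for a single toy series with pandas. It is one interpretation of the column descriptions, not the azapy implementation; `missing_summary` is a hypothetical helper, and the ‘cont’ column is deliberately left out.

```python
import numpy as np
import pandas as pd


def missing_summary(series):
    """Hypothetical helper: missing-data counts for one non-empty series."""
    isna = series.isna().to_numpy()
    valid = np.flatnonzero(~isna)  # positions of the valid observations
    if valid.size == 0:
        # No valid record at all: convention of this sketch.
        na_b = na_e = len(isna)
    else:
        na_b = int(valid[0])                    # leading missing records
        na_e = int(len(isna) - 1 - valid[-1])   # trailing missing records
    return {
        "begin": series.index[0],
        "end": series.index[-1],
        "length": len(series),
        "na_total": int(isna.sum()),
        "na_b": na_b,
        "na_e": na_e,
    }


prices = pd.Series(
    [np.nan, np.nan, 1.0, np.nan, 2.0, np.nan],
    index=pd.bdate_range("2024-01-08", periods=6),
)
summary = missing_summary(prices)
print(summary)
```

In this toy series two records are missing at the beginning, one at the end, and one in the interior, for a total of four.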

Notes

Its main application is to assess the missing data in the time-series extracted with azapy.readMkT function.

Module contents