azapy.MkT package
Submodules
azapy.MkT.MkTcalendar module
- azapy.MkT.MkTcalendar.NYSEgen(sdate='1980-01-01', edate='2050-12-31')
Returns the NYSE business calendar between sdate and edate.
To be deprecated in future versions. Use calendarGen(name="NYSE", sdate=sdate, edate=edate) instead.
- Parameters:
- sdate : date like, optional
Calendar start date. The default is ‘1980-01-01’.
- edate : date like, optional
Calendar end date. The default is ‘2050-12-31’.
- Returns:
- `numpy.busdaycalendar` : NYSE business calendar.
- azapy.MkT.MkTcalendar.calendarGen(name='NYSE', sdate='1980-01-01', edate='2050-12-31')
Returns the business calendar of an exchange.
- Parameters:
- name : str, optional
The exchange name. Valid exchange names are listed by get_calendar_names(). The default is ‘NYSE’, i.e., the New York Stock Exchange.
- sdate : date like, optional
Calendar start date. The default is ‘1980-01-01’.
- edate : date like, optional
Calendar end date. The default is ‘2050-12-31’.
- Returns:
- `numpy.busdaycalendar` : Exchange business calendar.
- azapy.MkT.MkTcalendar.get_calendar_names()
Returns calendar exchange names.
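Both calendar functions above return a `numpy.busdaycalendar`, which plugs into NumPy's business-day routines. The sketch below shows how such an object is typically consumed. It builds a small stand-in calendar by hand (a Monday-Friday week with a single holiday, chosen only for illustration) rather than calling azapy, so the holiday list is an assumption and not the real NYSE schedule.

```python
import numpy as np

# Stand-in calendar: Monday-Friday week with one holiday. A calendar returned
# by calendarGen would carry the exchange's full holiday list instead.
cal = np.busdaycalendar(weekmask="1111100", holidays=["2024-07-04"])

# Is a given date a business day on this calendar?
open_on_july_4 = bool(np.is_busday("2024-07-04", busdaycal=cal))

# Number of business days in the half-open interval [2024-07-01, 2024-07-08).
n_days = int(np.busday_count("2024-07-01", "2024-07-08", busdaycal=cal))

# First business day strictly after 2024-07-03 (skips the holiday).
next_day = np.busday_offset("2024-07-03", 1, roll="forward", busdaycal=cal)

print(open_on_july_4, n_days, next_day)
```

The same three calls should work unchanged with a calendar obtained from calendarGen, since only the `busdaycal` argument differs.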
azapy.MkT.MkTreader module
- class azapy.MkT.MkTreader.MkTreader(verbose=True)
Bases:
object
Collects historical market prices from market data providers such as ‘yahoo’, ‘eodhistoricaldata’, ‘alphavantage’ and ‘marketstack’.
- Attributes
dsource : dict of request instructions per symbol
delta_time : execution time of the request in seconds
rout : pandas.DataFrame containing historical prices for all symbols. It is created during the call of get function.
rout_status : request status information. It is created during the call of get_request_status function or during the call of function get with option verbose=True.
error_log : contains lists of missing historical observation dates. It is created together with rout_status.
Methods
get([symbol, sdate, edate, calendar, ...]) : Retrieves market data for a set of stock symbols.
get_error_log() : Returns lists of missing historical observation dates per symbol.
get_request_status([verbose]) : Reports abbreviated information about request status.
set_imputation([method]) : Historical market data imputation, i.e., filling missing values according to the imputation method.
- __init__(verbose=True)
Constructor
- Parameters:
- verbose : Boolean, optional
If set to True, additional information will be printed during the loading of historical prices. The default value is True.
- Returns:
- The MkTreader object.
- get(symbol=[], sdate='2012-01-01', edate='today', calendar=None, output_format='frame', source=None, force=False, save=True, file_dir='outDir', file_format='csv', api_key=None, param=None, verbose=None)
Retrieves market data for a set of stock symbols.
- Parameters:
- symbol : str or list of str, optional
Stock symbols to be loaded. The default is [].
- sdate : date like, optional
The start date of the historical time series. The default is “2012-01-01”.
- edate : date like, optional
The end date of the historical time series (must satisfy sdate <= edate). The default is ‘today’.
- calendar : str or numpy.busdaycalendar, optional
Business calendar. It can be the exchange calendar name as a str or a numpy.busdaycalendar object. If it is None, the NYSE business calendar is used. The default value is None.
- output_format : str, optional
The function output format. It can be:
‘frame’ - pandas.DataFrame
‘dict’ - dict of pandas.DataFrame. The symbols are the keys.
The default is ‘frame’.
- source : str or dict, optional
If it is a str, it names the market data provider for all historical price requests. Possible values are: ‘yahoo’, ‘alphavantage’, ‘alphavantage_yahoo’, ‘eodhistoricaldata’, ‘eodhistoricaldata_yahoo’ and ‘marketstack’. If set to None, it defaults to ‘yahoo’.
It can also be a dict containing specific instructions for each stock symbol. The dict keys are the symbols and the values are dicts of instructions specific to each symbol. Valid keys for an instruction dict are the names of this function's call variables, except ‘sdate’, ‘edate’, ‘calendar’ and ‘output_format’. The actual set of stock symbols is the union of the variable ‘symbol’ and the keys of the dict ‘source’. Values missing from a symbol's instruction dict are filled with the values of the function call variables, which act as generic values in the absence of specific instructions in the ‘source’ dict. The default is None.
Example of dict ‘source’:
source = {'AAPL': {'source': 'eodhistoricaldata', 'verbose': True}, 'SPY': {'source': 'yahoo', 'force': True}}
In this case, 2 symbols are added (union) to the set of symbols defined by the ‘symbol’ variable. For symbol ‘AAPL’ the provider source is eodhistoricaldata and the ‘verbose’ instruction is set to True. The remaining instructions, ‘force’, ‘save’, ‘file_dir’, ‘file_format’, ‘api_key’ and ‘param’, take the values of the corresponding function call variables. The same holds for symbol ‘SPY’. The instructions for any other symbols specified in the ‘symbol’ variable are set according to the values of the function call variables.
- force : Boolean, optional
True: it will try to collect historical prices exclusively from the market data providers.
False: it will first try to load the historical prices from a locally saved file. If such a file does not exist, the market data provider will be accessed.
If the file exists but the saved historical data is too short, it will try to collect only the missing values from the market data provider. The default is False.
- save : Boolean, optional
True: it will try to save the historical prices collected from the providers to a local file.
False: no attempt to save the data is made.
The default is True.
- file_dir : str, optional
Directory holding (or receiving) the saved historical market data. If it does not exist, it will be created. The default is “outDir”.
- file_format : str, optional
The saved file format for the historical prices. The supported file formats are csv, json and feather. The default is ‘csv’.
- api_key : str, optional
Provider API key (where required). If set to None, the API key is read from the following global environment variables:
APLPHAVANTAGE_API_KEY for alphavantage,
EODHISTORICALDATA_API_KEY for eodhistoricaldata,
MARKETSTACK_API_KEY for marketstack.
The default is None.
- param : dict, optional
Set of additional information needed to access the market data provider. At this time, only the alphavantage provider requires an additional parameter: the maximum number of API requests (symbols) per minute. It varies with the level of access corresponding to the API key. The minimum value is 5 for a free key, and it starts at 75 for premium keys. This value is stored in the max_req_per_min variable.
Example: param = {'max_req_per_min': 5}
This is also the default value for alphavantage if param is set to None. The default is None.
- verbose : Boolean, optional
If set to True, additional information will be printed during the loading of historical prices. If None, it is ignored; otherwise it overrides the value set by the constructor. The default value is None.
- Returns:
- `pandas.DataFrame` or dict of `pandas.DataFrame` : Historical market data.
The output format is designated by the value of the input parameter output_format.
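The way a dict-valued `source` combines with the other call variables can be sketched in plain Python. This is only an illustration of the rules described for the `source` parameter, not azapy's implementation: `resolve_instructions` is a hypothetical helper, and the fallback to ‘yahoo’ for symbols without their own ‘source’ entry is an assumption based on the documented default.

```python
def resolve_instructions(symbol, source, **call_defaults):
    """Build per-symbol request instructions (illustrative only).

    `resolve_instructions` is a hypothetical helper, not an azapy function.
    """
    if isinstance(symbol, str):
        symbol = [symbol]
    overrides = source if isinstance(source, dict) else {}
    # A str `source` is a generic provider name; otherwise assume 'yahoo'.
    generic_source = source if isinstance(source, str) else "yahoo"
    defaults = dict(call_defaults, source=generic_source)

    # The effective symbol set is the union of `symbol` and the dict keys,
    # kept in first-seen order.
    all_symbols = list(dict.fromkeys(list(symbol) + list(overrides)))

    # Symbol-specific values win; anything missing comes from the call.
    return {sym: {**defaults, **overrides.get(sym, {})} for sym in all_symbols}


plan = resolve_instructions(
    symbol=["MSFT", "SPY"],
    source={
        "AAPL": {"source": "eodhistoricaldata", "verbose": True},
        "SPY": {"source": "yahoo", "force": True},
    },
    force=False,
    save=True,
    verbose=False,
)

for sym, instructions in plan.items():
    print(sym, instructions)
```

Here ‘AAPL’ is added to the requested set through the dict keys, ‘SPY’ keeps its own ‘force’ setting, and ‘MSFT’ inherits every value from the call.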
- get_error_log()
Returns lists of missing historical observation dates per symbol.
- Returns:
- `dict` : The error log.
If it is an empty dict, there are no missing dates in the collected historical time series.
Otherwise, the keys of the dict are the symbols that have missing dates. The values for these keys are also dicts, with the following fields:
‘back’ : a list of missing dates at the tail of the time series
‘front’ : a list of missing dates at the head of the time series
‘mid’ : a list of missing dates in the middle of the time series
Fields with an empty list of dates are omitted.
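To make the shape of the error log concrete, the sketch below classifies missing dates for a single symbol into ‘front’, ‘mid’ and ‘back’ and drops empty fields. It is a stand-alone illustration under stated assumptions: `classify_missing` is a hypothetical helper, dates are plain ISO strings, and treating a fully missing series as ‘front’ is an arbitrary choice made here, not documented behavior.

```python
def classify_missing(expected, observed):
    """Split missing dates into 'front', 'mid', and 'back' groups.

    Hypothetical helper: `expected` is the ordered list of business dates
    and `observed` is the collection of dates actually present.
    """
    present = [d in observed for d in expected]
    if not any(present):
        # Nothing observed: report every date as missing at the head.
        return {"front": list(expected)} if expected else {}
    first = present.index(True)
    last = len(present) - 1 - present[::-1].index(True)
    groups = {
        "front": expected[:first],
        "mid": [d for d in expected[first:last + 1] if d not in observed],
        "back": expected[last + 1:],
    }
    # Fields with an empty list of dates are omitted.
    return {name: dates for name, dates in groups.items() if dates}


expected = ["2024-01-02", "2024-01-03", "2024-01-04",
            "2024-01-05", "2024-01-08", "2024-01-09"]
observed = {"2024-01-03", "2024-01-05", "2024-01-08"}
log = classify_missing(expected, observed)
print(log)
```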
- get_request_status(verbose=None)
Reports abbreviated information about request status.
- Parameters:
- verbose : Boolean, optional
If set to True, additional information will be printed during the function execution. If set to None, it is ignored; otherwise it overrides the value set by the constructor. The default value is None.
- Returns:
- `pandas.DataFrame` : The status report.
The column names are the symbols for which the data was requested. The rows contain the actual input parameters per symbol as well as:
‘nrow’ : the length of the historical time series.
‘sdate’ : first date in the time series.
‘edate’ : last date in the time series.
‘error’ : whether there are missing data. If its value is ‘Yes’, the actual list of missing dates per symbol can be obtained by calling get_error_log.
- set_imputation(method='linear')
Historical market data imputation, i.e., filling missing values according to the imputation method. The missing data at the beginning or the end of the time series remains unchanged.
Use with caution, and only where a small amount of data needs to be filled in. Any change to the market data will introduce a bias. If a large amount of data is missing, it is advisable to find a different source of historical market data.
The function will return the new corrected market data without altering the object state (the raw market data is preserved).
- Returns:
- pandas.DataFrame or dict of pandas.DataFrame, as set by the input parameter output_format in the call to get.
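The ‘linear’ method can be pictured with a small pure-Python sketch: interior gaps are filled along the straight line between the nearest known neighbors, while missing values at either end stay missing. This is an illustration of the idea and not azapy's code. `impute_linear` is a hypothetical helper, and it assumes equally spaced observations (one per business day) stored in a plain list.

```python
import math


def impute_linear(values):
    """Return a copy with interior NaN gaps filled by linear interpolation.

    Leading and trailing NaN values are left as they are.
    """
    out = list(values)  # work on a copy so the raw data is preserved
    known = [i for i, v in enumerate(out) if not math.isnan(v)]
    # Walk over consecutive pairs of known observations.
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out


nan = float("nan")
raw = [nan, 10.0, nan, nan, 16.0, nan]
filled = impute_linear(raw)
print(filled)
```

Returning a new list mirrors the documented behavior of leaving the raw market data untouched.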
azapy.MkT.readMkT module
- azapy.MkT.readMkT.readMkT(symbol=[], sdate='2012-01-01', edate='today', calendar=None, output_format='frame', source=None, force=False, save=True, file_dir='outDir', file_format='csv', api_key=None, param=None, imputation=None, verbose=True)
Retrieves market data for a set of stock symbols.
It is a wrapper for the MkTreader class that directly returns the requested historical time series. The function call variables are the same as for the MkTreader member function get, with the addition of imputation.
- Parameters:
- symbol : str or list of str, optional
Stock symbols to be loaded. The default is [].
- sdate : date like, optional
The start date of the historical time series. The default is “2012-01-01”.
- edate : date like, optional
The end date of the historical time series (must satisfy sdate <= edate). The default is ‘today’.
- calendar : str or numpy.busdaycalendar, optional
Business calendar. It can be the exchange calendar name as a str or a numpy.busdaycalendar object. If it is None, the NYSE business calendar is used. The default value is None.
- output_format : str, optional
The function output format. It can be:
‘frame’ - pandas.DataFrame
‘dict’ - dict of pandas.DataFrame. The symbols are the keys.
The default is ‘frame’.
- source : str or dict, optional
If it is a str, it names the market data provider for all historical price requests. Possible values are: ‘yahoo’, ‘alphavantage’, ‘alphavantage_yahoo’, ‘eodhistoricaldata’, ‘eodhistoricaldata_yahoo’ and ‘marketstack’. If set to None, it defaults to ‘yahoo’.
It can also be a dict containing specific instructions for each stock symbol. The dict keys are the symbols and the values are dicts of instructions specific to each symbol. Valid keys for an instruction dict are the names of this function's call variables, except ‘sdate’, ‘edate’, ‘calendar’ and ‘output_format’. The actual set of stock symbols is the union of the variable ‘symbol’ and the keys of the dict ‘source’. Values missing from a symbol's instruction dict are filled with the values of the function call variables, which act as generic values in the absence of specific instructions in the ‘source’ dict. The default is None.
Example of dict ‘source’:
source = {'AAPL': {'source': 'eodhistoricaldata', 'verbose': True}, 'SPY': {'source': 'yahoo', 'force': True}}
In this case, 2 symbols are added (union) to the set of symbols defined by the ‘symbol’ variable. For symbol ‘AAPL’ the provider source is eodhistoricaldata and the ‘verbose’ instruction is set to True. The remaining instructions, ‘force’, ‘save’, ‘file_dir’, ‘file_format’, ‘api_key’ and ‘param’, take the values of the corresponding function call variables. The same holds for symbol ‘SPY’. The instructions for any other symbols specified in the ‘symbol’ variable are set according to the values of the function call variables.
- force : Boolean, optional
True: it will try to collect historical prices exclusively from the market data providers.
False: it will first try to load the historical prices from a locally saved file. If such a file does not exist, the market data provider will be accessed.
If the file exists but the saved historical data is too short, it will try to collect only the missing values from the market data provider. The default is False.
- save : Boolean, optional
True: it will try to save the historical prices collected from the providers to a local file.
False: no attempt to save the data is made.
The default is True.
- file_dir : str, optional
Directory holding (or receiving) the saved historical market data. If it does not exist, it will be created. The default is “outDir”.
- file_format : str, optional
The saved file format for the historical prices. The supported file formats are csv, json and feather. The default is ‘csv’.
- api_key : str, optional
Provider API key (where required). If set to None, the API key is read from the following global environment variables:
APLPHAVANTAGE_API_KEY for alphavantage,
EODHISTORICALDATA_API_KEY for eodhistoricaldata,
MARKETSTACK_API_KEY for marketstack.
The default is None.
- param : dict, optional
Set of additional information needed to access the market data provider. At this time, only the alphavantage provider requires an additional parameter: the maximum number of API requests (symbols) per minute. It varies with the level of access corresponding to the API key. The minimum value is 5 for a free key, and it starts at 75 for premium keys. This value is stored in the max_req_per_min variable.
Example: param = {'max_req_per_min': 5}
This is also the default value for alphavantage if param is set to None. The default is None.
- imputation : str, optional
Method to fill missing data. Valid values are:
“linear” - filling with linearly interpolated values. The missing data at the ends of the time series are not modified.
None - no imputation. However, missing data may halt further computations.
The default is None.
Note: It is recommended to call the function without imputation and to analyze the quality of the raw data first. Apply an imputation algorithm only if the amount of missing data is small and lies in non-critical areas of the time series. In general, any imputation methodology will introduce bias in the final evaluations.
- verbose : Boolean, optional
If set to True, additional information will be printed during the loading of historical prices, prior to imputation. The default is True. Note: The quality of the market data after an imputation may be assessed using the azapy function summary_MkTdata.
- Returns:
- `pandas.DataFrame` or dict of `pandas.DataFrame` : Historical market data.
The output format is designated by the value of the input parameter output_format.
azapy.MkT.update_MkTdata module
- azapy.MkT.update_MkTdata.update_MkTdata(mktdir, source=None, api_key=None, param=None, except_file=[], verbose=True)
Updates all mkt data saved in a directory.
- Parameters:
- mktdir : str
Mkt data directory.
- source : str, optional
Mkt data provider. For more details see the azapy.MkTreader.get function doc. The default is None.
- api_key : str, optional
Mkt data provider API key. For more details see the azapy.MkTreader.get function doc. The default is None.
- param : dict, optional
Additional parameters required by the mkt data provider. For more details see the azapy.MkTreader.get function doc. The default is None.
- except_file : list, optional
List of symbols to be omitted from the update. The default is [].
- verbose : Boolean, optional
True prints a progress report,
False suppresses any printing to the terminal.
The default is True.
- Returns:
- int : Error code
200 : successful, everything updated
201 : some (or all) were not completely updated
101 : the mktdir does not exist
102 : unsupported mkt data sources
Notes
Files with unsupported extensions (see azapy.MkTreader.get function) are silently omitted from the update.
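Because update_MkTdata reports its outcome as an integer, calling code usually needs to translate that code before deciding what to do next. A minimal sketch follows; the `UPDATE_STATUS` table and `describe_update` helper are hypothetical names introduced here, with the code meanings copied from the list of return values above.

```python
# Return codes as listed in the update_MkTdata documentation.
UPDATE_STATUS = {
    200: "successful, everything updated",
    201: "some (or all) files were not completely updated",
    101: "the mktdir does not exist",
    102: "unsupported mkt data source",
}


def describe_update(code):
    """Translate an update_MkTdata return code into (ok, message)."""
    message = UPDATE_STATUS.get(code, "unknown return code")
    # Only 200 means every file was brought fully up to date.
    return code == 200, message


ok, message = describe_update(201)
print(ok, message)
```

In practice the argument would be the value returned by update_MkTdata; a literal is used here so the sketch runs without azapy or network access.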
azapy.MkT.summary_MkTdata module
- azapy.MkT.summary_MkTdata.summary_MkTdata(mktdata, calendar=None, sdate=None, edate=None)
Summary of MkT data time-series length and quality (checks for missing records).
- Parameters:
- mktdata : pandas.DataFrame or a dict of pandas.DataFrame
Market data in the format returned by the azapy.readMkT function.
- calendar : str or numpy.busdaycalendar, optional
Business calendar. It can be the exchange calendar name as a str or a numpy.busdaycalendar object. If it is None, the NYSE business calendar is used. The default value is None.
- sdate : date like, optional
Time-series start date. If it is None, sdate is set to the earliest date in mktdata. The default is None.
- edate : date like, optional
Time-series end date. If it is None, edate is set to the most recent date in mktdata. The default is None.
- Returns:
- `pandas.DataFrame` : A table with columns:
symbol : time-series symbol
begin : start date
end : end date
length : number of records
na_total : total number of NaN values
na_b : number of missing records at the beginning
na_e : number of missing records at the end
cont : total number of missing records
Notes
Its main application is to assess the missing data in the time-series extracted with azapy.readMkT function.
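The kind of counting behind the na_total, na_b and na_e columns can be illustrated on a single price series. The sketch below is not the library's implementation: `missing_counts` is a hypothetical helper operating on a plain list of floats, and it assumes that a missing record shows up as NaN.

```python
import math


def missing_counts(values):
    """Count NaN entries overall, at the head, and at the tail of a series."""
    isna = [math.isnan(v) for v in values]
    # Position of the first non-missing value equals the leading NaN count.
    na_b = next((i for i, miss in enumerate(isna) if not miss), len(isna))
    # The same scan from the end gives the trailing NaN count.
    na_e = next((i for i, miss in enumerate(reversed(isna)) if not miss), len(isna))
    return {"na_total": sum(isna), "na_b": na_b, "na_e": na_e}


nan = float("nan")
counts = missing_counts([nan, nan, 5.0, nan, 6.0, 7.0, nan])
print(counts)
```

A large na_b or na_e typically means the symbol's history is shorter than the requested window, whereas missing values in between point to gaps in the provider's data.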