CTRL+ALT+JOPX: Pandas

Showing posts with label Pandas. Show all posts

Sunday, December 28, 2025

Quick tip: retrieving raw, unprocessed files stored in Github

The domain raw.githubusercontent.com serves raw, unprocessed files stored in Github repositories - think of it as GitHub's "direct file download" backend. Tools like curl, wget or Python scripts can fetch files directly using a specific url composed of the user name, repo, branch, folder and filename

curl -O https://raw.githubusercontent.com/[user]/[repo]/[branch]/[folder]/[filename]

example: curl -O https://raw.githubusercontent.com/jorisp/tradingnotebooks/master/data/DJI.csv

You can also read these raw files form GitHub using pandas.read_csv

Tuesday, December 23, 2025

Interactive Pandas dataframes with ITables 2.0

ITables is a Python package available on Github (maintained by Marc Wouts) which changes how Pandas and Polars dataframes are rendered in Python notebooks and other Python applications. It works out of the box without any dependencies in Jupyter, Dash, Streamlit and Marimo.

For more info see:

Enable and disable Data Wranger in VS Code for Pandas DataFrame and Series

To disable default rendering of Pandas DataFrames in VS Code Jupyter Notebook with Data Wrangler (after installing the extension) - follow these steps:

Open Command Palette : Preferences: Open Settings (UI)
Search for: Data Wrangler
Uncheck the Data Wrangler>Output Renderer: Enabled Types for Pandas Dataframe and Series to disable the rendering and check them again when needed

Related links:

Getting started with Data Wrangler in VS Code

Wednesday, August 06, 2025

Divididend analysis of Telenor ASA using Jupyter Notebook

Cross posted from Divididend analysis of Telenor ASA using Jupyter Notebook

I just published the notebook dividends.ipynb on my GitHub repository jorisp/tradingnotebooks which shows how dividends contribute to the total return. This notebook uses the yfinance API to retrieve the data. I used Telenor ASA (a Norwegian telecom operator) as an example.

If you are considering to invest in foreign dividend stocks as a Belgian investor, you need to keep in mind the double taxation of dividends. Even with a withholding tax applied abroad, the Belgian government will tax your dividend again at a flat rate of 30%.

Disclaimer: The information on this blog is intended solely for informational and educational purposes. I am not a certified financial advisor, and the content provided here does not constitute professional financial advice. (Full disclaimer)

Sunday, July 14, 2024

Quick note: Python date and time objects

In Python, a naive datetime object is one that does not contain any information about time zones or daylight saving time. This means it is unaware of the context in which it exists, such as whether it represents local time, UTC, or any other time zone. By default, the datetime object in Python is naive. You can make them timezone aware using the pytz library.

If you are working with pandas dataframes or series, you can also use the tz_localize method of the Pandas DateTimeIndex object.

References:

Tuesday, April 18, 2023

Looking at historical returns of stocks and bonds with Power BI and Python

You might already have seen below graph taken from a study by JP Morgan Asset Management, but what if you would like to look at historical returns without going through the hassle of having to collect all the data yourself?

There is an interesting Excel sheet shared by Aswath Damodaran (@AswathDamodaran) that you can download from Historical Returns on Stocks, Bonds and Bills: 1928-2022 which looks at returns of different asset classes (stocks, bonds, bills, real estate and gold) over a longer time period.

In this post I will share some tips on how you can use this data in Power BI, Python and Jupyter notebooks.

This Historical Returns on Stocks, Bonds and Bills: 1928-2023 - Excel file file is updated in the first two weeks of every year and it is being maintained by Aswath Damodaran, who is a professor of Finance at the Stern School of Business at NYU, he is also known as the "Dean of Valuation" due to his experience in this area.

Visualizing S&P 500 and US Treasury bond returns using Power BI

I first converted the Excel from xls to xlsx format and afterwards it is quite easy to import the data from an Excel workbook files in Power BI . It is quite easy to visualize the returns of both stocks and US treasury bonds using a clustered column chart - I also added a minimum line for both stock and bond returns.

Expected risk and expected return should go hand in hand: the higher the expected return, the higher the expected risk. Risk means means that the future actual return may vary from the expected return (and the ultimate risk is loosing all of your assets). The first visual showed a 20-year annualized return between 1999 and 2018 for the S&P 500 of 5.8%. Average returns hide however the big swings in yearly returns - e.g. in 2008 (the Great Financial Crisis), the S&P 500 had a -36.5% yearly return. Bonds on average have a lower return but also have a lower risk profile.

The basic rule of thumb is to keep your “safe money” (i.e., money you don’t want to risk in stocks) in high-quality bonds. While this doesn’t give you 100% protection against losses at all times, it does provide you some peace of mind. I really like this quote: "If you can't sleep at night because of your stock market position, then you have gone too far. If this is the case, then sell your positions down to the sleeping level. (Jesse Livermoore)"

As you can see in the visualization below, in most years with a negative return for the S&P 500, the return for bonds is positive - with two notable exceptions 1969 and 2022. A common saying is to have your age in bonds. Using that general rule, a 45-year-old might have 45% of the total portfolio in bonds. If you want to more aggressive, you would have less than your age in bonds. The last decade with interest rates very low (or even negative) this probably wasn't a very profitable asset allocation but

things might have shifted.

The US Treasury Bond used in the Excel file is the 10-year US treasury bond for which you can download the data from FRED . The yearly return has been calculated by taking the yield and the price change for a par bond with that specific yield.

You can download the Power BI file histreturns.pbix from my Power BI Github repo

In the long run (see example below for different rolling windows from a 1-year to a 20 year period) stocks will outperform bonds but this again works with averages and it ignores the tail risk which might wreak havoc in your portfolio.

Reading data from Excel using Python

Now let's take a look at how you can read and manipulate the data in this Excel sheet using Python. To read an excel file as a DataFrame, I will use the pandas read_excel() method. Internally, Pandas. .read_excel uses a library called xlrd which you also need to install but I used the openpyxl library as an alternative which also works. So before you can read an excel file in pandas, you will need to install

the openpyxl library.

The above code reads only the table with data from the Excel file (which I downloaded in a subdirectory data from the Jupyter notebook) - see pandas.read_excel in the Pandas referencel documentation for full details:

sheet_name: can be an integer (for the index of a worksheet in an Excel file, default to 0, the first sheet) or the name of the worksheet you want to load
nrows: number of rows to read
skiprows: number of rows to skip
usecols: by default all columns will be read but also possible to pass in a list of columns to read into the dataframe like in the example

I just started exploring some data around stock-bond correlations and will be updating the Juypyter notebook on Github - https://github.com/jorisp/tradingnotebooks/blob/master/HistoricalReturns.ipynb

A couple of weeks ago I noticed this interesting tweet on rolling one-year-stock-bond correlations for six regimes from @WifeyAlpha - I think it would make an interesting exercise to see how to rebuild this using Python.

References:

Related posts:

Friday, August 12, 2022

Using the yFinance Python package to download financial data from Yahoo Finance - part 2

In a previous post I showed how you can download ticker data from Yahoo Finance using the yFinance Python package. I now updated the Jupyter notebook code sample using YFinance to also show how you can retrieve additional information (sector, industry, trailing and forward earnings per share, etc...). The Ticker class in the yFinance library contains the info property which returns a dictionary object ( a collection of key-value pairs where each key is associated with a value) which allows you to access specific information about an asset.

Since I wanted to know how fast data retrieval would be I also include the %%time magic command . Wall clock time measures how much time has passed. CPU time is how many (milli)seconds the CPU was busy.

Yahoo Finance contains data about stocks, Exchange Traded Funds (ETF), mutual funds and stock market indices - the information that you can retrieve for each of these differs, so it is safe to check in your code for the quoteType. Below example retrieves information about Apple stock, the iShares MSCI AWCI UCITS ETF (Acc) and a thematic mutual fund from KBC.

I also included a code snippet which shows how to retrieve this information for multiple assets and convert this into a Pandas dataframe.

Thursday, July 21, 2022

Using the yFinance Python package to download financial data from Yahoo Finance

In a previous post I explained how you can retrieve data from Yahoo Finance using Python and Pandas Datareader - an alternative Python library for retrieving data from Yahoo Finance is yFinance maintained by Ran Aroussi.

If you are using conda package manager, you will notice that you can not install yfinance using conda so you will need to revert to pip install yfinance. All documentation is available on yFinance as well as on https://github.com/ranaroussi/yfinance but I also uploaded a Jupyter notebook code sample on my Github - https://github.com/jorisp/tradingnotebooks/blob/master/yfinance_sample.ipynb

Sunday, June 12, 2022

Using Python and Pandas Datareader to retrieve financial data - part 3: Yahoo Finance

Yahoo Finance is one of the most popular sources of free financial data. It does not only contain historical data but also financial statements, dividend information and calculated metrics like e.g. 50 and 200 day moving average, beta, etc ... Yahoo Finance does not have an officially supported API anymore but pandas-datareader still allows you to access the data from Yahoo Finance in Python (other alternatives are yfinance and yahoo_fin).

This post is part of a series on using Pandas datareader to retrieve financial data:

In this post I have used version 0.10.0 of pandas-datareader (released July 13, 2021) which is currently working with Yahoo Finance - previous versions of pandas-datareader had to be updated after Yahoo made some changes on the underlying API.

Warning: Accessing Yahoo Finance using Python libraries is quite brittle so don't try to built production trading systems using this data source.

Accessing the Yahoo Finance API using pandas-datareader is very simple as shown in the screenshot below but I would also recommend implementing a cache mechanism for your queries using the requests-cache Python library to avoid having your IP address being banned. The full source of this Jupyter notebook is available at https://github.com/jorisp/tradingnotebooks/blob/master/YahooFinancesingle.ipynb

References:

Thursday, May 19, 2022

Using Python and Pandas Datareader to retrieve financial data - part 2: Fama & French data library

This post is part of a series on using Pandas datareader to retrieve financial data:

In this post we will look at the datasets made available by Eugene Fama and Kenneth French. Eugene Fama and Kenneth French did a lot of research on which factors drive security returns. In 1993, they published the Three Factor Model (see article "Common risk factors in returns of stocks and bonds", Journal of Financial Economics 33, 1993), which showed that their factors (size of the firm, book-to-market values and excess return) capture a statistically significant fraction of the variation of stock returns. In 2014, Fama and French adapted their model to include five factors. Fama won the Nobel Prize for Economics in 2013 for his research. Fama also published a number of papers on the Efficient Market Hypothesis and random walk theory.

Fama and French still publish the returns of various investment factors analyzed by them on their homepage on a regular basis. You can download this data using the pandas_datareader library - you can take a look at the official documentation, Fama-French Data (Ken French's Data library) to get started or take a look at the Jupyter notebook that I shared on Github https://github.com/jorisp/tradingnotebooks/blob/master/FAMA.ipynb

References:

Wednesday, May 04, 2022

Using Python and Pandas Datareader to retrieve financial data - part 1: Federal Reserve Data (FRED)

The pandas-datareader Python library covers a number of APIs with global fundamental macro- and industry data sources including the following (for a full list see Pandas Datareader - data sources ):

St. Louis FED (FRED): Federal Reserve data on the U.S. economy and financial markets
Fama/French data library : market data on portfolios capturing returns on key risk factors like size, value, and momentum by industry
Yahoo Finance : retrieve daily stock prices, historical corporate actions (dividends and stock splits) from Yahoo Finance
World Bank: global database with economic/social indicators and demographics.

This post is part of a series on using Pandas datareader to retrieve financial data:

In this post I will focus on retrieving data from FRED using pandas-datareader. Federal Reserve Economic Data (FRED) - https://fred.stlouisfed.org/ is a database maintained by the Federal Reserve Bank of St. Louis. It has more than 800.000 data time series covering categories such as Economic growth & employment, monetary & fiscal policy, demographics, industries, commodity prices at different frequencies (daily, monthly, annual). One of the interesting time series you can find here are 3-month Treasury Bill Secondary Market rate (TB3MS) or 1-year US Treasury bills which are used a proxy for the risk free rate in financial modeling.

There is however some missing data on the TB1YR - so I will be using the TB3MS (3 Month) in my next example. You will notice that all time series are identified by a short abbreviation that you can find by searching on the FRED website.

Full source code is available at https://github.com/jorisp/tradingnotebooks/blob/master/FRED.ipynb

References:

FRED API: get US economic data which drives the real estate market
Pandas-datareader python library

Wednesday, April 22, 2020

Working with multiple time series trading data from Quandl in Jupyter Notebooks

In the previous example - Using Euronext stock data from Quandl in Jupyter notebooks I downloaded a single dataset from Quandl. But it is also possible to download multiple datasets by passing in a list of Quandl codes.

In the example below, I downloaded the prices of a number of diversified holding companies which are traded on Euronext Brussels and compared the cumulative returns (not including dividend payments) using Jupyter Notebooks.

The Quandl Python API allows you to make a filtered time series call and request only specific
columns - in this example the 'Last' (Closing price) is retrieved by specifying the index 4. In a next
step I renamed the columns in the pandas dataframe to make it easier to work with the data
afterwards.

Take a look at the full python notebook at https://github.com/jorisp/tradingnotebooks/blob/master/Quandl_Belgian_Holdings-Shared.ipynb to see how this data can be used to visualize cumulative returns for these different stocks

%matplotlib inline
import quandl
import matplotlib.pyplot as plt


quandl.ApiConfig.api_key = "<Your Key Here>"

#Retrieve Last price only for the 5 holdings (excluding mono holdings) trading on Euronext Brussels
#Data is available from February 2014 onwards - Ackermans Van Haren (ACKB), Brederode (BREB), Sofina (SOF), 
#GBL and Bois Sauvage (COMB )
data = quandl.get(['EURONEXT/ACKB.4','EURONEXT/BREB.4','EURONEXT/SOF.4','EURONEXT/GBLB.4','EURONEXT/COMB.4'])

#Rename column names 
data.rename(columns={'EURONEXT/ACKB - Last': 'ACKB', 'EURONEXT/BREB - Last': 'BREB','EURONEXT/SOF - Last':'SOF',
                     'EURONEXT/GBLB - Last':'GBLB','EURONEXT/COMB - Last':'COMB'},inplace=True)

Monday, April 20, 2020

Using Euronext stock data from Quandl in Jupyter notebooks

The last couple of weeks I have been learning about Python and how to use it for stock and derivative trading. One of the challenges is getting stock trading data for European stocks (without having to pay for it). One of the first things I started with is using Jupyter notebooks to quickly visualize stock market information.

The easiest way to get started with Jupyter is using an all-in-one Python distribution - the one I used is Anaconda since it is easy to setup and it includes a number of interesting libraries I want to use in next steps.

I like to try out things hands-on but I did use a number of training resources to get up to speed:

To get trading data about European stocks I used Quandl. Quandl is a marketplace for financial and economic data which is either freely available or requires a paid subscription. Data is contributed by multiple data publishers like World Bank, trading exchanges and investment research firms. Quandl provides REST API access to the available data sets but also has specific Python and R libraries. You first need to register to get an API key. A lot of European stocks are traded on Euronext and Quandl provides you access to Euronext data - https://www.quandl.com/data/EURONEXT-Euronext-Stock-Exchange

Install the quandl Python package using the Anaconda command prompt. It is best to setup virtual environments to manage separate package installation that you need for a particular project, isolating the packages in other environments but for simplicity I just installed in the base environment.

Next it is quite easy to retrieve stock data from Quandl - you first import the quandl package and next you call the quandl.get() method. By default, Quandl will retrieve the dataset into a pandas DataFrame. Since I specified no additional parameters, the entire timeseries dataset was retrieved - from February 2014 until now. Afterwards I used the plot command which uses the matplotlib library to display a graph of the closing prices.

For the full Jupyter notebook take a look at Github https://github.com/jorisp/tradingnotebooks/blob/master/Quandl_API_Euronext_ABI_Shared.ipynb