Download Financial Dataset Using Yahoo Finance in Python | A Complete Guide

Arnab Mondal 02 Dec, 2022 • 6 min read

This article was published as a part of the Data Science Blogathon

Introduction

This article shows you how to collect stock market and cryptocurrency data from the internet, load it into a pandas DataFrame, and build your own projects on top of it. With that data in hand, you can create your own ML models and experiment with real-world data.

In this article, I will demonstrate two methods, both of which use Yahoo Finance as the data source, since it is free and requires no registration. You can also use other data sources such as Quandl, Tiingo, IEX Cloud, and more.

 

Getting Ready

The first approach uses the yfinance module in Python, which is very easy to work with. The second uses yahoofinancials, which requires some extra effort but returns a good deal of extra information. We will discuss that later; for now, we begin by importing the required modules into our code.

Initial Setup:

We need to load the following libraries:

import pandas as pd
import yfinance as yf
from yahoofinancials import YahooFinancials

If you do not have these libraries, you can install them via pip.

!pip install yfinance
!pip install yahoofinancials

First Method: How to use yfinance

yfinance was previously known as ‘fix_yahoo_finance’ and later became a module of its own; it is not an official Yahoo library. ‘yfinance’ is now a very popular, Python-friendly library that can be used either as a patch to pandas_datareader or as a standalone library. Many people use it to download stock prices as well as crypto prices. Without any further delay, let us execute the following code. We will begin by downloading the stock price of ‘Apple’.

Code :

Output :

 

The data interval is set to 1 day, but the interval can be specified explicitly with values like 1m, 5m, 15m, 30m, 60m, 1h, 1d, 1wk, 1mo, and more. The above command for downloading the data passes a start and an end date, but you can also simply download the data with the code given below :

Code :

aapl_df = yf.download('AAPL')

Output :

The download function has many parameters, which you can find in the documentation; start and end are among the most commonly used. Since the dataset here was small, the progress bar was turned off (progress=False). Showing it only makes sense for high-volume downloads.
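As a rough sketch of how start, end, interval, and the progress flag fit together, the helper below assembles the keyword arguments for yf.download and rejects an unknown interval before any request is made. The function names build_download_kwargs and download_prices are hypothetical, and the interval list reflects the values yfinance documents at the time of writing, so treat it as an assumption and check the current documentation.

```python
# Intervals documented by yfinance at the time of writing (an assumption;
# check the library documentation for the current list).
VALID_INTERVALS = {"1m", "2m", "5m", "15m", "30m", "60m", "90m",
                   "1h", "1d", "5d", "1wk", "1mo", "3mo"}


def build_download_kwargs(start=None, end=None, interval="1d", progress=False):
    """Assemble keyword arguments for yf.download, validating the interval."""
    if interval not in VALID_INTERVALS:
        raise ValueError(f"Unsupported interval: {interval!r}")
    kwargs = {"interval": interval, "progress": progress}
    # Only pass the dates the caller actually supplied.
    if start is not None:
        kwargs["start"] = start
    if end is not None:
        kwargs["end"] = end
    return kwargs


def download_prices(ticker, **options):
    """Download price history for one ticker (requires network access)."""
    import yfinance as yf  # imported lazily so the helper above works offline
    return yf.download(ticker, **build_download_kwargs(**options))
```

With placeholder dates, a call such as download_prices('AAPL', start='2019-01-01', end='2019-12-31', interval='1wk') would request weekly Apple prices for 2019 with the progress bar hidden.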

We can also download the prices of more than one asset at a time by passing a list of ticker symbols (e.g. [‘META’, ‘MSFT’, ‘AAPL’]; Meta Platforms was listed as FB until June 2022) as the tickers argument. We can also provide an additional argument, auto_adjust=True, so that all the prices are adjusted for potential corporate actions like splits.

Apart from the yf.download function, we can also use the Ticker class. You can execute the code below to download the last 5 years of Apple's stock prices.
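A minimal sketch of a multi-ticker download is shown below. The helpers normalize_tickers and download_many are hypothetical names introduced here for illustration; the cleaning step is there because a stray space or lower-case letter in a symbol is an easy mistake to make when typing the list by hand.

```python
def normalize_tickers(tickers):
    """Strip whitespace, upper-case, and de-duplicate symbols, keeping order."""
    cleaned = []
    for symbol in tickers:
        symbol = symbol.strip().upper()
        if symbol and symbol not in cleaned:
            cleaned.append(symbol)
    return cleaned


def download_many(tickers, start, end):
    """Download adjusted prices for several tickers in one call (needs network)."""
    import yfinance as yf  # lazy import: only needed when actually downloading
    return yf.download(
        normalize_tickers(tickers),
        start=start,
        end=end,
        auto_adjust=True,  # adjust prices for corporate actions such as splits
        progress=False,
    )
```

With several tickers, the returned DataFrame typically has two-level columns that combine the price field and the ticker, so selecting the 'Close' level gives one closing-price column per symbol. The exact column layout can differ between yfinance versions.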

Code :

ticker = yf.Ticker('AAPL')
aapl_df = ticker.history(period="5y")
aapl_df['Close'].plot(title="APPLE's stock price")

Output :

 

One advantage of using the Ticker class is that it exposes several other attributes we can make use of. The available attributes include :

  • info – This attribute returns a JSON-like dictionary that contains a lot of information about the company, starting with its full business name, summary, industry, and the exchanges it is listed on along with country and time zone, and more. It also includes the beta coefficient.

  • recommendations – This attribute contains a historical list of recommendations made by different analysts regarding the stock, such as whether to buy or sell it.

  • actions – This displays corporate actions like splits and dividends.

  • major_holders – This attribute displays the major holders of the share along with other relevant details.

  • institutional_holders – This attribute shows all the institutional holders of a particular share.

  • calendar – This attribute shows upcoming events such as earnings, and you can even add them to your Google Calendar through code. Basically, it shows the important dates for a company, such as dividend dates.
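To give a feel for working with info, here is a small sketch that picks a few entries out of that dictionary. The helper company_snapshot is a hypothetical name, the key names are ones that commonly appear in info but are not guaranteed for every symbol, and the stand-in object with made-up values lets the example run without a network call.

```python
from types import SimpleNamespace


def company_snapshot(ticker, fields=("longName", "industry", "country", "beta")):
    """Pick a few entries out of a Ticker-like object's `info` dictionary."""
    info = ticker.info or {}
    # .get() returns None for keys that are missing for this symbol.
    return {field: info.get(field) for field in fields}


# Offline stand-in with made-up values, playing the role of yf.Ticker(...).
fake_ticker = SimpleNamespace(info={"longName": "Example Corp", "beta": 1.2})
snapshot = company_snapshot(fake_ticker)
print(snapshot)
```

With a live connection, the same function can be called as company_snapshot(yf.Ticker('AAPL')).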

If you want to explore how these functions work in more detail, you can check out the GitHub repository of yfinance.

Second Method: How to use yahoofinancials?

The second method is to use the yahoofinancials module which is a bit tougher to work with but it provides much more information than yfinance. We will begin by downloading Apple’s stock prices.

To do this, we first create a YahooFinancials object by passing the Apple ticker symbol, and then call its methods with the parameters needed to get the required data. The returned data is in JSON format, so we tidy it up and transform it into a DataFrame to display it properly.
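To make the reshaping step concrete without a network call, the sketch below uses a hand-written dictionary in the shape that get_historical_price_data() returns: a top-level key per ticker holding a 'prices' list of records. The ticker 'XYZ' and all numbers are invented for illustration, and prices_to_frame is a hypothetical helper name.

```python
import pandas as pd

# Hand-written sample in the shape get_historical_price_data() returns;
# the ticker and the numbers are invented, not real quotes.
sample = {
    "XYZ": {
        "prices": [
            {"date": 1546300800, "formatted_date": "2019-01-01",
             "open": 10.0, "high": 11.0, "low": 9.5, "close": 10.5,
             "adjclose": 10.4, "volume": 1000},
            {"date": 1546905600, "formatted_date": "2019-01-08",
             "open": 10.5, "high": 12.0, "low": 10.2, "close": 11.8,
             "adjclose": 11.7, "volume": 1500},
        ]
    }
}


def prices_to_frame(data, ticker):
    """Turn the nested 'prices' list for one ticker into a date-indexed DataFrame."""
    frame = pd.DataFrame(data[ticker]["prices"])
    # 'date' is a Unix timestamp; keep the readable 'formatted_date' as the index.
    return frame.drop("date", axis=1).set_index("formatted_date")


xyz_df = prices_to_frame(sample, "XYZ")
print(xyz_df)
```

The same two steps, building the DataFrame from the 'prices' list and then indexing by 'formatted_date', are applied to the real Apple data in the code that follows.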

Code :

yahoo_financials = YahooFinancials('AAPL')
data = yahoo_financials.get_historical_price_data(start_date='2019-01-01', 
                                                  end_date='2019-12-31', 
                                                  time_interval='weekly')
aapl_df = pd.DataFrame(data['AAPL']['prices'])
aapl_df = aapl_df.drop('date', axis=1).set_index('formatted_date')
aapl_df.head()

Output :

 

On a technical level, obtaining historical stock prices takes a few more steps than with yfinance, but that is mostly due to the huge volume of data returned. Now we move on to some of the important methods of yahoofinancials.

  • get_stock_quote_type_data() – This method returns a lot of generic information about a stock, similar to the info attribute in yfinance.

  • get_summary_data() – This method returns a summary of the whole company along with useful data like the beta value, price to book value, and more.

  • get_stock_earnings_data() – This method returns information on the quarterly and yearly earnings of the company along with the next date when the company will report its earnings.

  • get_financial_stmts() – This is another useful method for retrieving the financial statements of a company, which is useful for the analysis of a stock.

  • get_historical_price_data() – This method is similar to yfinance's download() function or Ticker.history() method; it returns the prices of a stock for a given start_date, end_date, and time_interval.
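The methods above can be combined into a single call, as in the sketch below. The helper collect_fundamentals is a hypothetical name, and it assumes that get_financial_stmts takes a frequency and a statement type as its first two arguments; check the yahoofinancials README for the exact signature in your installed version. FakeClient is an offline stand-in with placeholder results so the example runs without a network call.

```python
def collect_fundamentals(client):
    """Call a few YahooFinancials-style methods and bundle the results."""
    return {
        "quote_type": client.get_stock_quote_type_data(),
        "summary": client.get_summary_data(),
        "earnings": client.get_stock_earnings_data(),
        # Assumed argument order: frequency first, then statement type.
        "income_stmt": client.get_financial_stmts("annual", "income"),
    }


class FakeClient:
    """Offline stand-in that mimics the method names with placeholder results."""

    def get_stock_quote_type_data(self):
        return {"XYZ": {"quoteType": "EQUITY"}}

    def get_summary_data(self):
        return {"XYZ": {"beta": 1.1}}

    def get_stock_earnings_data(self):
        return {"XYZ": {}}

    def get_financial_stmts(self, frequency, statement_type):
        return {"incomeStatementHistory": {"XYZ": []}}


bundle = collect_fundamentals(FakeClient())
print(sorted(bundle))
```

With the real library, passing YahooFinancials('AAPL') in place of FakeClient() would return the live dictionaries instead of the placeholders.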

Like yfinance, this module can also download data for several companies at once, and cryptocurrency data can be downloaded as shown in the following code.

Code :

yahoo_financials = YahooFinancials('BTC-USD')
data = yahoo_financials.get_historical_price_data("2019-07-10", "2021-05-30", "monthly")
btc_df = pd.DataFrame(data['BTC-USD']['prices'])
btc_df = btc_df.drop('date', axis=1).set_index('formatted_date')
btc_df.head()

Output :

 

For more details about the module, you can check out its GitHub Repository.

EndNotes

All of this information is ultimately sourced from Yahoo Finance. You now know how to use Yahoo Finance from Python and how to import any stock or cryptocurrency price and information dataset into your code so that you can begin exploring and experimenting with it. Good luck with your adventures, and feel free to share your code with me on LinkedIn or reach out to me in case of any doubts or errors.

Thank you for reading till the end. I hope you are doing well, staying safe, and are either already vaccinated or getting vaccinated soon.

About the Author :

Arnab Mondal

Data Engineer & Python Developer | Freelance Tech Writer

Link to my other articles

Just a guy who loves to code and learn new languages and concepts

Responses From Readers

Manny 15 Jun, 2021

I've been using yfinance for almost a year and I am grateful to be able to easily get accurate historical stock prices at no cost: thank you for the article, and thanks to yahoo finance, the author of the module and those who maintain it. I may be missing something about this module, or other, but it seems today's closing price is not made available until a later time (not sure what time that is) as opposed to the actual end of day. As a result, to get today's closing price with the data download, I have been downloading the historical data the next morning. I appreciate your thoughts. Thanks, Manny

Cleveland 05 Feb, 2022

This platform of teaching Computer language Python is contain on so many interesting and helpful data . I learn many programs from there and use these in my project development .

alightpro 17 Jun, 2022

Hi, Thanks for sharing the nice information. It is a very nice and useful article. Your information is very helpful thanks those who want to take online it certificate programs
