Hi everyone, this is the first article for the 【How 2】 column.

Whenever a question pops into my mind, I head to Google and try my luck. It usually takes quite some time to filter out the outdated answers, the answers that don't apply to my situation, and so on, before I can start piecing together an answer that actually solves my problem.

So here’s a corner for collecting all these small notes, in the hope that they help people who have the same “How to …” questions.

Let’s get started.

Intro

While working on a trading automation bot, there’s one piece of data you need for every stock you consider a potential target: the stock price.

There are many third-party services and data warehouses that let you fetch a stock's OHLC (open, high, low, close) prices through their APIs. Calls to these APIs look like this:
yfinance

import yfinance as yf

msft = yf.Ticker("MSFT")

# get historical market data
hist = msft.history(period="max")

tiingo

import requests
headers = {
    'Content-Type': 'application/json'
}
requestResponse = requests.get("https://api.tiingo.com/tiingo/daily/MSFT/prices?startDate=2019-01-02&token=<your_token>", headers=headers)
print(requestResponse.json())

quandl

import quandl

stock_tickers = [
    'MSFT',
]
mydata = quandl.get(stock_tickers, start_date = '2019-03-19', end_date='2019-03-21')

mydata.loc[:,(mydata.columns.str.contains('Close'))].T

If you have a sharp eye, you’ll notice what we’re trying to tackle here: every one of these APIs is called with the ticker of the stock.

A ticker is a short symbol or code that represents a specific stock or company. So before using your preferred API to get price data, you need to know the stock's ticker. Since we’re working with code, we’d like a list of tickers that we can feed to the code, so that it can start processing the data for us automatically.
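To make the "feed a list of tickers to the code" idea concrete, here is a minimal sketch that builds one Tiingo request URL per ticker, following the URL format shown earlier. The helper name `build_tiingo_url`, the second ticker, and the `YOUR_TOKEN` placeholder are assumptions made for illustration; they are not part of Tiingo's client or of the original code.

```python
from urllib.parse import urlencode


def build_tiingo_url(ticker: str, start_date: str, token: str) -> str:
    """Hypothetical helper: build the daily-prices URL for one ticker."""
    query = urlencode({"startDate": start_date, "token": token})
    return f"https://api.tiingo.com/tiingo/daily/{ticker}/prices?{query}"


# Placeholder list; the rest of the article is about building the real one.
tickers = ["MSFT", "AAPL"]

# One URL per ticker, ready to be passed to requests.get().
urls = [build_tiingo_url(t, "2019-01-02", "YOUR_TOKEN") for t in tickers]
```

Once the ticker list is generated automatically, the same loop covers every stock without any manual typing.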

So the question today would be:

“How to get a list of tickers listed in NYSE or Nasdaq?”

Solution

As stated in the article [How to get all common stock tickers], Nasdaq maintains the list of listed stocks, along with some preliminary data about them, in text files on its FTP server. So the idea is to fetch those files from the remote FTP server and parse their content into the format we need.
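As a rough sketch of the "fetch the file from the remote FTP" step, the standard library's `urllib.request.urlopen` can read `ftp://` URLs directly. This is an alternative to calling `curl`, not the method used later in this article. The function name `fetch_symbol_file` and its `opener` parameter are assumptions introduced here so the function can be exercised without network access.

```python
from urllib.request import urlopen

NASDAQ_LISTED_URL = "ftp://ftp.nasdaqtrader.com/SymbolDirectory/nasdaqlisted.txt"
OTHER_LISTED_URL = "ftp://ftp.nasdaqtrader.com/SymbolDirectory/otherlisted.txt"


def fetch_symbol_file(url: str, opener=urlopen, timeout: float = 30.0) -> str:
    """Download one symbol file and return its content as text.

    `opener` defaults to urllib's urlopen; it is a parameter only so that
    a stand-in can be supplied when no network is available.
    """
    with opener(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Usage (requires network access and a reachable FTP server):
# nasdaq_text = fetch_symbol_file(NASDAQ_LISTED_URL)
# other_text = fetch_symbol_file(OTHER_LISTED_URL)
```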

Example:
nasdaqlisted.txt

Symbol	Security Name	Market Category	Test Issue	Financial Status	Round Lot Size	ETF	NextShares
AACG	ATA Creativity Global - American Depositary Shares, each representing two common shares	G	N	N	100	N	N
AACQ	Artius Acquisition Inc. - Class A Common Stock	S	N	N	100	N	N

cont…

otherlisted.txt

ACT Symbol	Security Name	Exchange	CQS Symbol	ETF	Round Lot Size	Test Issue	NASDAQ Symbol
A	Agilent Technologies, Inc. Common Stock	N	A	N	100	N	A
AA	Alcoa Corporation Common Stock	N	AA	N	100	N	AA
AAA	Listed Funds Trust AAF First Priority CLO Bond ETF	P	AAA	Y	100	N	AAA

cont…
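The raw files are pipe-delimited (the code later in this article reads them with `sep='|'`), so parsing them takes little effort. Below is a minimal sketch using only the standard library. The sample text reuses the rows shown above. The trailing "File Creation Time" line and its exact format are assumptions about the file's footer, and `parse_symbol_file` is a name made up for this example.

```python
import csv
from io import StringIO

# Sample in the raw pipe-delimited layout; the last line imitates the
# footer that the real files are assumed to end with.
SAMPLE = (
    "Symbol|Security Name|Market Category|Test Issue|Financial Status|Round Lot Size|ETF|NextShares\n"
    "AACG|ATA Creativity Global - American Depositary Shares, each representing two common shares|G|N|N|100|N|N\n"
    "AACQ|Artius Acquisition Inc. - Class A Common Stock|S|N|N|100|N|N\n"
    "File Creation Time: 0000000000:00|||||||\n"
)


def parse_symbol_file(text: str) -> list:
    """Turn a pipe-delimited symbol file into a list of dicts, minus the footer."""
    reader = csv.DictReader(StringIO(text), delimiter="|", quoting=csv.QUOTE_NONE)
    symbol_column = reader.fieldnames[0]  # "Symbol" or "ACT Symbol"
    return [
        row for row in reader
        if not row[symbol_column].startswith("File Creation Time")
    ]


rows = parse_symbol_file(SAMPLE)
symbols = [row["Symbol"] for row in rows]
```

Using the first column name instead of hard-coding "Symbol" lets the same function handle otherlisted.txt, whose first column is "ACT Symbol".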

All we need is a list of tickers, which shouldn’t be hard. However, since this is the raw data maintained by Nasdaq, there are a few things we need to pay attention to before processing it.

We need to screen out the Test Issue entries, which are not real companies.
To save time later, when we process the stocks' fundamental data to find the better-quality ones, we can remove up front the companies whose financial status is either bankrupt or deficient.
We remove the stocks that are not listed on our target exchanges.
We remove the stocks that have either . or $ in their symbol. Those are not among the listed stocks that we’re paying attention to.
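The four rules above can be sketched as a single predicate over one parsed row of nasdaqlisted.txt. This is only an illustration of the rules under a few assumptions: rows are plain dicts keyed by the column names, the target market is category "G" as in the pandas code later on, and the last three sample rows are made up to trigger each rule.

```python
import re

# Matches symbols containing "." or "$".
FORBIDDEN_CHARS = re.compile(r"[.$]")


def keep_nasdaq_row(row: dict, market_categories=("G",)) -> bool:
    """Return True when a nasdaqlisted.txt row passes all four rules."""
    return (
        row["Test Issue"] != "Y"                           # rule 1: no test issues
        and row["Financial Status"] == "N"                 # rule 2: normal status only
        and row["Market Category"] in market_categories    # rule 3: target market
        and FORBIDDEN_CHARS.search(row["Symbol"]) is None  # rule 4: no "." or "$"
    )


rows = [
    {"Symbol": "AACG", "Market Category": "G", "Test Issue": "N", "Financial Status": "N"},
    {"Symbol": "AACQ", "Market Category": "S", "Test Issue": "N", "Financial Status": "N"},
    # The rows below are invented for illustration only.
    {"Symbol": "TESTA", "Market Category": "G", "Test Issue": "Y", "Financial Status": "N"},
    {"Symbol": "ABC.D", "Market Category": "G", "Test Issue": "N", "Financial Status": "N"},
    {"Symbol": "BROKE", "Market Category": "G", "Test Issue": "N", "Financial Status": "Q"},
]

kept = [row["Symbol"] for row in rows if keep_nasdaq_row(row)]
```

Here only AACG survives: AACQ is in a different market category, and each invented row fails exactly one of the remaining rules.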

Two methods: Bash vs. Python

In Bash

echo "[\"$(echo -n "$(
    echo -en "$(
            curl -s --compressed 'ftp://ftp.nasdaqtrader.com/SymbolDirectory/nasdaqlisted.txt' | tail -r | tail -n+2 | tail -r | tail -n+2 | perl -pe 's/ //g' | tr '|' ' ' | awk '{printf $1" "} {print $4}'
        )\n$(
            curl -s --compressed 'ftp://ftp.nasdaqtrader.com/SymbolDirectory/otherlisted.txt'  | tail -r | tail -n+2 | tail -r | tail -n+2 | perl -pe 's/ //g' | tr '|' ' ' | awk '{printf $1" "} {print $7}'
        )" | grep -v 'Y$' | awk '{print $1}' | grep -v '[^a-zA-Z]' | sort
    )" | perl -pe 's/\n/","/g')\"]"

In Python

import pandas as pd
import subprocess
from io import StringIO

def __get_tradable_tickers() -> list:
    # * Get all the text from Nasdaq
    proc = subprocess.Popen([
            "curl", "-s", "--compressed", "ftp://ftp.nasdaqtrader.com/SymbolDirectory/nasdaqlisted.txt"
        ],
        stdout=subprocess.PIPE
    )
    nasdaq_text, err = proc.communicate()

    # * Get all the text for the other (non-Nasdaq) exchanges
    proc = subprocess.Popen([
            "curl", "-s", "--compressed", "ftp://ftp.nasdaqtrader.com/SymbolDirectory/otherlisted.txt"
        ],
        stdout=subprocess.PIPE
    )
    other_text, err = proc.communicate()

    # * Convert them into DataFrame
    nasdaqlisted = pd.read_csv(StringIO(nasdaq_text.decode('utf-8')), sep='|', header=0)
    otherlisted = pd.read_csv(StringIO(other_text.decode('utf-8')), sep='|', header=0)

    # * Remove the
    # * 1. test issue stocks,
    # * 2. stocks not in the Nasdaq Global Market (Market Category 'G'),
    # * 3. companies whose Financial Status is not normal ('N'),
    # * 4. tickers that contain "." or "$"
    # * from the DataFrame
    nasdaqlisted = nasdaqlisted[
        (~nasdaqlisted['Test Issue'].str.contains('Y', na=False)) &
        (~nasdaqlisted['Symbol'].str.contains(r'\.|\$', na=False, regex=True)) &
        (nasdaqlisted['Market Category'].eq('G')) &
        (nasdaqlisted['Financial Status'].eq('N'))
    ]

    # * Remove the
    # * 1. test issue stocks,
    # * 2. stocks not listed on NYSE American ('A'), NYSE ('N'), or NYSE Arca ('P'),
    # * 3. tickers that contain "." or "$"
    # * from the DataFrame
    otherlisted = otherlisted[
        (~otherlisted['Test Issue'].str.contains('Y', na=False)) &
        (~otherlisted['ACT Symbol'].str.contains(r'\.|\$', na=False, regex=True)) &
        (otherlisted['Exchange'].isin(['A', 'N', 'P']))
    ]

    # * To remove the duplicated tickers by using set
    tickers = list(set(nasdaqlisted['Symbol'].values.tolist() + otherlisted['ACT Symbol'].values.tolist()))

    return tickers
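The Bash version prints a JSON array of sorted tickers, while the Python function returns a list whose order is arbitrary because it passes through a set. If comparable output is wanted, a small sketch like the following could be used. The helper name `tickers_to_json` is an assumption, not part of the original code.

```python
import json


def tickers_to_json(tickers) -> str:
    """Serialize tickers as a compact, sorted, de-duplicated JSON array."""
    return json.dumps(sorted(set(tickers)), separators=(",", ":"))


# Example with a hand-written list standing in for the function's result.
payload = tickers_to_json(["MSFT", "A", "AA", "A"])
```

Sorting before serializing keeps the output stable between runs, which makes it easier to diff one day's list against the next.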

Take away

Take away as much as you like. The code in this article is free to use.

Reference

How to get all common stock tickers

Bloomberg source

Nasdaq FTP folder

Nasdaq Market tiers & categories

Michael's blog

【How 2】 Vol. 1. How 2 get all tradable tickers in US markets