adasThePrime/pyahoo

Pyahoo

A full-featured Python wrapper for the Yahoo Finance API with synchronous and asynchronous clients, real-time WebSocket streaming, rich type hints, dataclasses, and enums.

Tip

Just the tip of the iceberg: The README only scratches the surface of Pyahoo's capabilities. The library is comprehensively typed and internally documented. Your IDE (VS Code, PyCharm, etc.) will autocomplete everything. Hover over any method or class in your editor to see full docstrings, available parameters, and exact return types!

Warning

These APIs are unofficial and undocumented. Yahoo may throttle, block, or change them at any time. Use at your own risk and respect Yahoo's Terms of Service.

Installation

From PyPI:

pip install pyahoo

# With WebSocket support
pip install "pyahoo[websocket]" # quoted so shells like zsh don't expand the brackets

From source:

git clone https://github.com/adasThePrime/pyahoo.git
cd pyahoo
pip install -e ".[websocket]" # -e for editable mode

Quick Start

Sync

The YahooFinance client allows you to fetch data synchronously. It's recommended to use it as a context manager (with block) to ensure the underlying HTTP session is properly closed.

from pyahoo import YahooFinance

with YahooFinance() as yf:
    quotes = yf.quote("AAPL", "NVDA", "MSFT")
    for q in quotes:
        print(q) # "Apple Inc. (AAPL) $255.63 +0.73% [PRE]"

    # Historical chart data
    chart = yf.chart("AAPL", range="1mo", interval="1d")
    print(chart)  # "ChartResult(AAPL 22 bars, 0 divs, 0 splits)"
    for bar in chart: # ChartResult is iterable
        print(bar) # "OHLCV(ts=1711929600 O=171.0 H=173.5 L=170.2 C=172.8 V=52340000)"

Async

For high-performance applications, Pyahoo provides an identical asynchronous API via AsyncYahooFinance. It is a drop-in replacement built on asyncio, so multiple requests can run concurrently.

import asyncio
from pyahoo import AsyncYahooFinance

async def main():
    async with AsyncYahooFinance() as yf:
        quotes = await yf.quote("AAPL", "NVDA")
        chart = await yf.chart("AAPL", range="1mo", interval="1d")
        print(f"AAPL: {len(chart)} bars")
    
asyncio.run(main())
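
The concurrency benefit comes from scheduling several awaitables at once rather than awaiting them one by one. The sketch below shows that pattern with asyncio.gather. FakeAsyncClient is a hypothetical stand-in (not part of Pyahoo) so the block runs without network access; with the real client you would pass an AsyncYahooFinance instance instead.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical stand-in for an async client; NOT part of Pyahoo."""

    async def chart(self, symbol: str) -> list:
        await asyncio.sleep(0.01)  # simulate network latency
        return [1.0, 2.0, 3.0]     # pretend these are three bars


async def fetch_bar_counts(client, symbols):
    # Start every request at once; gather returns results in input order.
    charts = await asyncio.gather(*(client.chart(s) for s in symbols))
    return {symbol: len(chart) for symbol, chart in zip(symbols, charts)}


counts = asyncio.run(fetch_bar_counts(FakeAsyncClient(), ["AAPL", "NVDA", "MSFT"]))
print(counts)
```

Because the three simulated requests overlap, the total wait is roughly one request's latency instead of three.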

Configuration

Clients can be tailored to a specific locale at instantiation. Setting region or lang requests the symbols and data fields for that locale, which Yahoo otherwise selects based on your IP address. A timeout keeps your program from hanging indefinitely on slow connections, and proxy routes requests through a proxy server.

from pyahoo import YahooFinance

yf = YahooFinance(
    lang="en-US",
    region="US",
    timeout=30,
    proxy="https://proxy:8080"
)

# The client is also usable without a context manager
nvda = yf.quote("NVDA")
yf.close() # Don't forget to close

Model Representation

The data returned by Pyahoo isn't raw JSON dicts; responses are strongly typed dataclasses. This gives you autocomplete in your IDE, static type checking, and human-friendly string representations. Models format themselves readably when printed, and collection-like models support iteration (iter()), length (len()), and truthiness checks (bool()).

from pyahoo import YahooFinance

with YahooFinance() as yf:
    # Quotes
    quote = yf.quote("AAPL") # Directly returns Quote object
    print(quote)

    # Options
    chain = yf.options("AAPL")
    print(chain) # OptionsChain(AAPL $255.63 53 calls, 49 puts, 23 expirations)
    print(len(chain)) # 102 (total contracts)

    # Screeners
    result = yf.screener("MOST_ACTIVES")
    print(result) # Screener(Most Actives: 25/329 quotes)
    for q in result: # ScreenerResult is iterable
        print(q)

    # Search
    search = yf.search("Tesla")
    print(search) # SearchResult(6 quotes, 3 news)
    if search: # Truthy if results exist
        print(search) # "TSLA: Tesla, Inc. (EQUITY)"
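
To illustrate how a model can support print(), iteration, len(), and truthiness at the same time, here is a minimal sketch of the collection protocol on a dataclass. DemoQuote and DemoResult are hypothetical names for illustration only; they are not Pyahoo's classes and make no claim about how the library implements its models.

```python
from dataclasses import dataclass, field


@dataclass
class DemoQuote:
    symbol: str
    price: float

    def __str__(self) -> str:
        # Human-friendly form used by print()
        return f"{self.symbol} ${self.price:.2f}"


@dataclass
class DemoResult:
    quotes: list = field(default_factory=list)

    def __iter__(self):
        return iter(self.quotes)

    def __len__(self) -> int:
        return len(self.quotes)

    # No __bool__ is defined, so bool() falls back to __len__:
    # an empty result is falsy, a non-empty one is truthy.

    def __str__(self) -> str:
        return f"DemoResult({len(self)} quotes)"


result = DemoResult([DemoQuote("AAPL", 255.63), DemoQuote("NVDA", 120.0)])
print(result)
for q in result:
    print(q)
```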

WebSocket

Pyahoo provides real-time price streaming via Yahoo Finance's WebSocket API. Incoming data is Protocol Buffer (protobuf) encoded and base64-wrapped; Pyahoo decodes it automatically into PricingData dataclass objects for easy consumption.
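
To make the two decoding layers concrete, the sketch below base64-decodes a message and then walks the protobuf wire format by hand. It is not Pyahoo's decoder. The sample message is built locally, and its layout (field 1 as a symbol string, field 2 as a 32-bit float price) is an assumption chosen for illustration, not a documented Yahoo schema.

```python
import base64
import struct


def read_varint(buf: bytes, pos: int):
    """Decode a protobuf varint starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return value, pos
        shift += 7


def decode_fields(buf: bytes) -> dict:
    """Parse top-level fields into {field_number: raw_value}."""
    fields, pos = {}, 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:    # varint
            fields[number], pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (strings, bytes, sub-messages)
            length, pos = read_varint(buf, pos)
            fields[number] = buf[pos:pos + length]
            pos += length
        elif wire_type == 5:  # 32-bit value, interpreted here as a float
            fields[number] = struct.unpack("<f", buf[pos:pos + 4])[0]
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
    return fields


# Build a sample message locally: field 1 = "AAPL", field 2 = 150.5 (assumed layout)
raw = b"\x0a\x04AAPL" + b"\x15" + struct.pack("<f", 150.5)
wire_message = base64.b64encode(raw)  # what a base64-wrapped frame would look like

decoded = decode_fields(base64.b64decode(wire_message))
print(decoded)
```

In practice a generated protobuf class would replace the hand-written parser; the manual version is only meant to show what "protobuf encoded and base64-wrapped" means on the wire.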

Symbol format: Pass symbols exactly as Yahoo expects them. Indices use a ^ prefix ("^GSPC", "^DJI", "^VIX"); equities, crypto, ETFs, and futures do not ("AAPL", "BTC-USD", "SPY", "ES=F").

Sync WebSocket

The WebSocket class provides a blocking streaming client. Use it with callbacks or as a blocking iterator. By default, when used as a context manager, it waits for the underlying connection to be established and fails fast if the connection cannot be made.

Callback pattern:

from pyahoo import PricingData, WebSocket # assumes PricingData is exported at the package top level

with WebSocket() as ws:
    ws.subscribe("AAPL", "NVDA", "MSFT")

    @ws.on_tick
    def handle_tick(tick: PricingData):
        print(tick)

    @ws.on_error
    def handle_error(error: Exception):
        print(f"Error: {error}")

    ws.run() # Blocks until ws.stop()

Iterator pattern:

from pyahoo import WebSocket

with WebSocket() as ws:
    ws.subscribe("NVDA", "MSFT", "AAPL")
    
    for tick in ws:
        print(tick)

        if tick.id == "NVDA" and tick.price > 300:
            ws.unsubscribe("NVDA")

Background thread:

from pyahoo import PricingData, WebSocket # assumes PricingData is exported at the package top level
import time

ws = WebSocket()
ws.run_in_background() # Returns the daemon thread
ws.wait_until_connected(timeout=10.0) # Wait for connection to succeed (fails fast if unreachable)
ws.subscribe("NVDA", "AAPL")

@ws.on_tick
def print_tick(tick: PricingData):
    print(tick)

time.sleep(30) # Stream for 30 seconds
ws.stop()

Async WebSocket

The AsyncWebSocket class provides native async/await streaming with async for iteration.

import asyncio
from pyahoo import AsyncWebSocket
 
async def main():
    async with AsyncWebSocket() as ws:
        await ws.subscribe("MSFT", "NVDA", "AAPL")
 
        async for tick in ws:
            print(tick)
 
            if tick.id == "NVDA" and tick.price > 300:
                await ws.unsubscribe("NVDA")
 
asyncio.run(main())

AsyncWebSocket accepts both def and async def callbacks interchangeably.

async with AsyncWebSocket() as ws:
    await ws.subscribe("AAPL", "NVDA", "MSFT")
 
    @ws.on_tick(symbols="MSFT")
    def handle_sync(tick: PricingData):
        print(tick)
 
    @ws.on_tick(symbols=["AAPL", "NVDA"])
    async def handle_async(tick: PricingData):
        print(tick)
 
    @ws.on_connect
    async def on_connect():
        print("Connected!")
 
    @ws.on_disconnect
    def on_disconnect():
        print("Disconnected!")

Event Callbacks

The following callback features are supported identically by both WebSocket and AsyncWebSocket.

Lifecycle events: Register handlers for connection lifecycle events. Keep in mind that due to auto-reconnection logic, these events may be triggered multiple times during the lifespan of the WebSocket client.

with WebSocket() as ws:
    
    @ws.on_connect
    def on_connect():
        print("Connected!")
 
    @ws.on_disconnect
    def on_disconnect():
        print("Disconnected!")

Filtered symbol callbacks: Register handlers for specific symbols only.

with WebSocket() as ws:
    ws.subscribe("AAPL", "NVDA", "MSFT")

    @ws.on_tick(symbols="AAPL")
    def handle_aapl(tick: PricingData):
        print(tick)

    @ws.on_tick(symbols=["NVDA", "MSFT"])
    def handle_nvda_msft(tick: PricingData):
        print(tick)

Filtered error callbacks: Route different exception types to separate handlers. Pass a class, a list of classes, a class name as a string, or a list of strings.

with WebSocket() as ws:
    ws.subscribe("AAPL")
 
    @ws.on_error(exceptions=ConnectionError)
    def handle_connection(error):
        print(f"Connection issue: {error}")
 
    @ws.on_error(exceptions=["StreamingError", "TimeoutError"])
    def handle_specific(error):
        print(f"Specific error: {error}")
 
    @ws.on_error
    def catch_all(error):
        print(f"Unhandled error: {error}")
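
One way such a filter could be evaluated is sketched below. This is an assumption about the general technique, not Pyahoo's actual dispatch code: class filters use isinstance, while string filters are compared against the class names in the exception's inheritance chain. The helper names normalize and matches are hypothetical.

```python
def normalize(spec):
    """Turn a class, a class name, or a list of either into a list."""
    if spec is None:
        return []
    return list(spec) if isinstance(spec, (list, tuple)) else [spec]


def matches(error: BaseException, spec) -> bool:
    """Return True if error satisfies the filter; no filter means catch-all."""
    items = normalize(spec)
    if not items:
        return True
    for item in items:
        if isinstance(item, str):
            # Compare against every class name in the error's MRO,
            # so a subclass also matches its parent's name.
            if any(cls.__name__ == item for cls in type(error).__mro__):
                return True
        elif isinstance(error, item):
            return True
    return False


print(matches(ConnectionResetError(), ConnectionError))
print(matches(TimeoutError(), ["StreamingError", "TimeoutError"]))
print(matches(ValueError(), "TimeoutError"))
```

Matching strings by class name lets callers filter on exception types without importing them, at the cost of ambiguity if two unrelated classes share a name.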

Direct registration: Callbacks can also be registered directly without decorators, which is useful for dynamic or programmatic handler assignment.

with WebSocket() as ws:
    ws.subscribe("AAPL")
 
    def handle_aapl(tick):
        print(tick)
 
    ws.on_tick(handle_aapl, symbols="AAPL")
    ws.on_connect(lambda: print("Connected!"))

Bound Methods

Every response object holds a reference to the authenticated client that fetched it, so you can chain further queries directly without re-passing symbols.

from pyahoo import YahooFinance

with YahooFinance() as yf:
    nvda = yf.quote("NVDA")

    # These all use the same authenticated session
    chart = nvda.chart()
    summary = nvda.summary()
    options = nvda.options()
    recs = nvda.recommendations()
    insights = nvda.insights()
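
The idea of a response object that remembers its client can be sketched as follows. DemoClient and DemoQuote are hypothetical stand-ins written for illustration; they do not reflect Pyahoo's internals, only the general pattern of delegating back to the originating client.

```python
from dataclasses import dataclass, field


class DemoClient:
    """Hypothetical client that records the calls it receives."""

    def __init__(self):
        self.calls = []

    def quote(self, symbol):
        self.calls.append(("quote", symbol))
        return DemoQuote(symbol=symbol, _client=self)

    def chart(self, symbol, range="1mo"):
        self.calls.append(("chart", symbol, range))
        return f"chart:{symbol}:{range}"


@dataclass
class DemoQuote:
    symbol: str
    # Back-reference to the client; hidden from repr and equality checks
    _client: DemoClient = field(repr=False, compare=False)

    def chart(self, **kwargs):
        # Delegate to the same client, filling in this quote's symbol
        return self._client.chart(self.symbol, **kwargs)


client = DemoClient()
nvda = client.quote("NVDA")
print(nvda.chart(range="5d"))
```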

For asynchronous environments, bound methods are prefixed with async_ to distinguish them from standard synchronous functions.

from pyahoo import AsyncYahooFinance

async def main():
    async with AsyncYahooFinance() as yf:
        nvda = await yf.quote("NVDA")
        chart = await nvda.async_chart()

Enums

Pyahoo provides fully typed Enum classes for nearly all API options, including intervals, ranges, query operators, and modules. Using these enums reduces the chance of sending invalid strings to the API, lets a type checker catch mistakes, and makes options discoverable via autocomplete.

from pyahoo.enums import Interval, Range, ScreenerId, QuoteSummaryModule
# or
from pyahoo import Interval, Range, ScreenerId, QuoteSummaryModule, YahooFinance

with YahooFinance() as yf:
    # Chart with enum values
    yf.chart("AAPL", range=Range.Y1, interval=Interval.D1)

    # Screener with enum
    result = yf.screener(ScreenerId.MOST_ACTIVES, count=10)

    # Quote summary modules
    summary = yf.quote_summary("AAPL",
        QuoteSummaryModule.FINANCIAL_DATA,
        QuoteSummaryModule.EARNINGS,
        QuoteSummaryModule.RECOMMENDATION_TREND,
    )

Common Patterns

Batch Quotes

Fetching multiple symbols at once is usually more efficient than firing individual requests. The quote() method accepts variadic arguments, so numerous quotes can be fetched in a single network round-trip.

symbols = ["AAPL", "NVDA", "MSFT", "GOOGL", "AMZN", "META", "TSLA"]
quotes = yf.quote(*symbols)
for q in quotes:
    print(q) # Human-friendly output
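
For very long watchlists it may be sensible to split the symbols into fixed-size batches and issue one quote() call per batch. The helper below is a generic sketch; the name chunked and the batch size are assumptions, since this README does not state how many symbols a single request accepts.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(symbols: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` symbols."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(symbols)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


batches = list(chunked(["AAPL", "NVDA", "MSFT", "GOOGL", "AMZN"], 2))
print(batches)

# Intended use with a client (not executed here):
# for batch in chunked(symbols, 50):
#     quotes = yf.quote(*batch)
```

Note that the Model Representation example suggests quote() returns a single Quote when given one symbol, so a final batch of size one may need separate handling.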

Custom Date Range Chart

While predefined text ranges (like "1mo") are common, you can also request an exact window using Unix timestamps. The chart() method supports explicit period1 (start) and period2 (end) parameters for precise date windowing, alongside events such as dividends and splits.

import time

period1 = int(time.time()) - 86400 * 365 # 1 year ago
period2 = int(time.time())
chart = yf.chart("AAPL", period1=period1, period2=period2, interval="1d",
                  events=["div", "split"])
print(f"Bars: {len(chart)}, Dividends: {len(chart.dividends)}")
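
When the window is defined by calendar dates rather than "now minus N seconds", the dates have to be converted to Unix timestamps first. The helper below is a small sketch using only the standard library; to_unix is a hypothetical name, and it pins each date to midnight UTC so the result does not depend on the machine's local time zone.

```python
from datetime import date, datetime, timezone


def to_unix(day: date) -> int:
    """Return midnight UTC at the start of `day` as whole seconds since the epoch."""
    return int(datetime(day.year, day.month, day.day, tzinfo=timezone.utc).timestamp())


period1 = to_unix(date(2024, 1, 1))
period2 = to_unix(date(2024, 7, 1))
print(period1, period2)

# Intended use with a client (not executed here):
# chart = yf.chart("AAPL", period1=period1, period2=period2, interval="1d")
```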

Screening Stocks

Yahoo Finance offers a set of predefined screeners (day gainers, most active, etc.). The screener() method fetches their results as typed objects. Pass the count argument to limit how many quotes are returned.

from pyahoo import ScreenerId

gainers = yf.screener(ScreenerId.DAY_GAINERS, count=10)
for q in gainers: # ScreenerResult is iterable
    print(q)

Full Financial Profile

For deep fundamental analysis, building a company's financial profile usually means querying several different modules. The quote_summary() method lets you request multiple specialized modules (such as financial data, earnings, and key statistics) in one request.

from pyahoo import QuoteSummaryModule as M

summary = yf.quote_summary("AAPL",
    M.FINANCIAL_DATA, M.SUMMARY_PROFILE, M.DEFAULT_KEY_STATISTICS,
    M.EARNINGS, M.RECOMMENDATION_TREND, M.INSTITUTION_OWNERSHIP
)
print(summary) # "QuoteSummary(AAPL: 6 modules [financialData, ...])"

fd = summary.financial_data
if fd:
    print(f"Revenue: {fd['totalRevenue']}")
    print(f"Profit margin: {fd['profitMargins']}")

Options Chain

Options data arrives from Yahoo as deeply nested JSON. Pyahoo parses it into typed objects: the returned OptionsChain exposes expirations, calls, and puts as directly accessible attributes, which makes them easy to filter with list comprehensions.

chain = yf.options("AAPL")
print(chain) # "OptionsChain(AAPL $198.50 45 calls, 42 puts, 12 expirations)"

# Collection protocol
print(f"Total contracts: {len(chain)}")

# ITM calls
itm = [c for c in chain.calls if c.in_the_money]
for c in itm:
    print(c) # "AAPL240419C00150000 strike=$150.00 bid=$48.50 ask=$49.00 ITM"

Error Handling

All exceptions carry a status_code attribute for programmatic inspection:

from pyahoo.exceptions import AuthenticationError, APIError, RateLimitError, NotFoundError
# or
from pyahoo import AuthenticationError, APIError, RateLimitError, NotFoundError, YahooFinance

with YahooFinance() as yf:
    try:
        aapl = yf.quote("AAPL")
    except AuthenticationError:
        print("Session auth failed")
    except NotFoundError as e:
        print(f"Not found (status={e.status_code})")
    except RateLimitError as e:
        print(f"Rate limited (status={e.status_code}) — wait and retry")
    except APIError as e:
        print(f"API error: {e}") # "[HTTP 400] Yahoo Finance API error: ..."
        print(f"Status: {e.status_code}")
        print(f"Body: {e.response_body}")
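
The "wait and retry" advice for rate limits is commonly implemented as exponential backoff. The sketch below is one generic way to do that. DemoRateLimitError is a local stand-in so the block runs offline, and call_with_backoff is a hypothetical helper, not something Pyahoo provides; with the real library you would pass retry_on=(RateLimitError,).

```python
import time


class DemoRateLimitError(Exception):
    """Local stand-in for a rate-limit exception; NOT Pyahoo's class."""


def call_with_backoff(func, *, retries=3, base_delay=1.0,
                      sleep=time.sleep, retry_on=(DemoRateLimitError,)):
    """Call func(); on a retryable error wait base_delay * 2**attempt, then retry."""
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)


# Demo: a function that fails twice before succeeding
attempts = {"n": 0}


def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise DemoRateLimitError("slow down")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = call_with_backoff(flaky, base_delay=0.5, sleep=delays.append)
print(result, delays)
```

Injecting the sleep function keeps the helper testable; production code would leave the default time.sleep in place and might add random jitter to the delay.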

License

This project is licensed under the MIT License. See the LICENSE file for details.