All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- edgar/__init__.py: Added `NullHandler` to the `edgar` logger, following Python library logging best practice so applications control log output.
- edgar/session.py: Downgraded per-request URL, parameter, and rate-limit sleep logs from `info` to `debug`.
- edgar/async_session.py: Same `info` → `debug` log-level fix as `session.py`.
- edgar/parser.py: Downgraded pagination URL and entry-count logs from `info` to `debug`.
- edgar/client.py: Added `logger`; logs `debug` on init (rate_limit, cache settings).
- edgar/cache.py: Added `logger`; logs `debug` on cache hit, miss, expired, set, and invalidate.
- edgar/tickers.py: Added `debug` logging for cache hit/miss and successful resolution; `warning` on failed ticker/CIK lookup.
- edgar/submissions.py: Added `logger`; logs `debug` on submissions cache hit.
- edgar/xbrl.py: Added `logger`; logs `debug` on company_facts cache hit.
- edgar/datasets.py: Added `logger`; logs `info` on bulk download start, `debug` on per-file extraction with row counts.
- edgar/search.py: Added `logger`; logs `debug` with EFTS search params before request.
- edgar/company.py: Added `logger`; logs `debug` on identifier resolution path (CIK vs ticker).
- edgar/async_session.py: Added `logger.error()` before each `raise EdgarRequestError`, matching the `session.py` pattern for consistent error observability.
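
The `NullHandler` entry above follows the pattern the Python logging documentation recommends for libraries. A minimal sketch of that pattern is shown here; the logger name `edgar_demo` and the `fetch()` helper are stand-ins invented for illustration, not the package's actual code. The library stays silent until the application attaches its own handler.

```python
import io
import logging

# Library side: attach a NullHandler so the library never prints on its own.
# "edgar_demo" is a stand-in name for this sketch, not the package's real logger.
lib_logger = logging.getLogger("edgar_demo")
lib_logger.addHandler(logging.NullHandler())


def fetch(url: str) -> str:
    """Pretend request that logs at debug level, like the per-request logs above."""
    lib_logger.debug("GET %s", url)
    return "ok"


# Application side: opt in to the library's debug output explicitly.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))
lib_logger.addHandler(handler)
lib_logger.setLevel(logging.DEBUG)

fetch("https://example.invalid/filings")
captured = stream.getvalue()
```

Because the messages are emitted at `debug`, an application that leaves the logger at its default level sees nothing from these calls.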
- tests/test_logging.py: 7 unit tests for logging output (cache hit/miss/set/invalidate, session error, rate-limit sleep, async session error).
- edgar/parser.py: Changed `except KeyError` to `except IndexError` in ticker symbol extraction; `values[2]` is a list index, not a dict key.
- edgar/utilis.py: Deleted dead duplicate of `utils.py` (nothing imported it).
- edgar/parser.py: Removed 3 commented-out `print()` debug lines.
- edgar/async_client.py: `EdgarAsyncClient`, an async counterpart of `EdgarClient` using `httpx`.
  - Same API surface with `await`: `resolve_ticker()`, `resolve_cik()`, `get_company_info()`, `get_filings()`, `get_facts()`, `search()`, `download()`.
  - Async context manager support (`async with EdgarAsyncClient(...) as client`).
  - Enables `asyncio.gather()` for concurrent requests in web apps and data pipelines.
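
The concurrency pattern that the async client enables can be sketched with the standard library alone. `FakeAsyncClient` below is a hypothetical stand-in that performs no network I/O and returns made-up identifiers; only the `async with` and `asyncio.gather()` usage is the point.

```python
import asyncio


class FakeAsyncClient:
    """Hypothetical stand-in for an async client; it performs no network I/O."""

    async def __aenter__(self) -> "FakeAsyncClient":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> None:
        return None

    async def resolve_ticker(self, ticker: str) -> str:
        # Simulate latency, then return a made-up identifier.
        await asyncio.sleep(0.01)
        return f"cik-for-{ticker}"


async def resolve_many(tickers: list[str]) -> list[str]:
    async with FakeAsyncClient() as client:
        # gather() runs the coroutines concurrently and preserves input order.
        return await asyncio.gather(*(client.resolve_ticker(t) for t in tickers))


results = asyncio.run(resolve_many(["AAPL", "MSFT"]))
```

With a real client, a rate limiter would still bound how many of the gathered requests are sent per second.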
- edgar/async_session.py: `EdgarAsyncSession`, an async HTTP session with `httpx.AsyncClient`.
  - Sliding-window rate limiter using `asyncio.sleep()`.
  - Retry logic with exponential backoff on non-200 responses.
  - `make_request()`, `fetch_page()`, `download()`, `close()` coroutines.
- pyproject.toml: Added `[async]` optional dependency group (`pip install python-sec[async]`).
  - Requires `httpx>=0.28`.
- edgar/__init__.py: Exports `EdgarAsyncClient` from the top-level package.
- tests/test_async_client.py: 28 unit tests for async session and client (init, rate limiting, URL building, make_request, download, fetch_page, ticker/CIK resolution, search, context manager).
- samples/use_async_client.py: Sample demonstrating async client usage with concurrent requests.
- edgar/datasets.py: Bulk DERA financial statement dataset download and extraction.
  - `Datasets.get_financial_statements(year, quarter)` downloads the quarterly ZIP from SEC DERA and returns parsed TSV data as `dict[str, list[dict]]` (keys: `sub`, `num`, `tag`, `pre`).
  - `Datasets.get_financial_statements_dataframes(year, quarter)` returns the same data as `dict[str, pandas.DataFrame]` (requires `pandas` optional dep).
  - Internal `_extract_tsv_zip()` helper handles ZIP extraction and tab-separated parsing.
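
The ZIP-to-rows step described above can be approximated with `zipfile` and `csv`. This is a sketch under assumptions, not the package's `_extract_tsv_zip()`: the function name `extract_tsv_zip`, the `.txt`/`.tsv` member filter, and the UTF-8 encoding are all choices made for illustration.

```python
import csv
import io
import zipfile


def extract_tsv_zip(data: bytes) -> dict[str, list[dict]]:
    """Parse every .txt/.tsv member of a ZIP archive as tab-separated rows."""
    tables: dict[str, list[dict]] = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            if not name.endswith((".txt", ".tsv")):
                continue
            with archive.open(name) as raw:
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                reader = csv.DictReader(text, delimiter="\t")
                # Key each table by file name without extension, e.g. "sub".
                tables[name.rsplit(".", 1)[0]] = list(reader)
    return tables


# Build a tiny in-memory archive so the sketch runs without any download.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("sub.txt", "adsh\tname\n0001\tExample Corp\n")
tables = extract_tsv_zip(buffer.getvalue())
```

Each table is a list of dicts keyed by the header row, which matches the `dict[str, list[dict]]` shape named in the entry.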
- tests/test_datasets.py: 15 unit tests for bulk dataset download, extraction, DataFrame conversion, and error handling.
- samples/use_dataset_service.py: Added bulk financial statements sections demonstrating `get_financial_statements()` and the DataFrame variant.
- edgar/models.py: `to_json()` and `to_csv()` serialization methods on all model classes (`Filing`, `CompanyInfo`, `Submission`, `Fact`, `Facts`, `SearchResult`).
  - `result.to_json(path=None, indent=2)` serializes to a JSON string; optionally writes to file.
  - `result.to_csv(path=None)` serializes to a CSV string (header + one row); optionally writes to file.
  - Module-level `to_json(items, path=None)` and `to_csv(items, path=None)` for serializing lists of models.
  - List/dict properties are JSON-encoded in CSV cells for lossless round-tripping.
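
The "JSON-encoded in CSV cells" idea can be illustrated as follows. The helper names `to_csv_row` and `from_csv_row` and the sample record are hypothetical; the sketch only shows why encoding nested values as JSON text lets them survive a CSV round trip.

```python
import csv
import io
import json


def to_csv_row(record: dict) -> str:
    """Serialize one record as a header line plus one data row."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(record))
    writer.writeheader()
    # Lists and dicts are stored as JSON text so they survive the round trip.
    writer.writerow({
        k: json.dumps(v) if isinstance(v, (list, dict)) else v
        for k, v in record.items()
    })
    return out.getvalue()


def from_csv_row(text: str, json_fields: set[str]) -> dict:
    """Read the row back, decoding the fields known to hold JSON."""
    row = next(csv.DictReader(io.StringIO(text)))
    return {k: json.loads(v) if k in json_fields else v for k, v in row.items()}


record = {"name": "Example Corp", "tickers": ["EXM", "EXMP"]}
restored = from_csv_row(to_csv_row(record), json_fields={"tickers"})
```

The `csv` module quotes the JSON cell automatically, so the embedded commas and quotation marks do not break the row.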
- tests/test_serialization.py: 35 unit tests for JSON/CSV serialization (instance methods, module-level functions, file writing, edge cases).
- samples/use_models.py: Added JSON and CSV serialization sections demonstrating `to_json()` and `to_csv()`.
- samples/cookbook_company_research.ipynb: Cookbook notebook: company research workflow (ticker lookup, metadata, filings, XBRL facts, DataFrame export, multi-company comparison).
- samples/cookbook_xbrl_analysis.ipynb: Cookbook notebook: XBRL financial analysis (taxonomy browsing, concept retrieval, unit filtering, time-series DataFrames, frames cross-company comparison).
- samples/cookbook_filing_search.ipynb: Cookbook notebook: filing search & download (full-text search, form/date filters, pagination, document download, save to file).
- samples/cookbook_bulk_pipeline.ipynb: Cookbook notebook: bulk data pipeline (batch processing, multi-company aggregation, SEC datasets, rate limiting, caching, CSV export).
- edgar/cache.py: In-memory TTL cache for SEC EDGAR API responses.
  - `TTLCache` class with `get()`, `set()`, `invalidate()`, `clear()`, `__len__()`, `__repr__()`.
  - Uses `time.monotonic()` for expiration immune to wall-clock adjustments.
  - Module-level TTL constants: `TTL_TICKERS` (24h), `TTL_TAXONOMY` (24h), `TTL_SUBMISSIONS` (1h).
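
A TTL cache keyed on `time.monotonic()` might look roughly like the sketch below. `TinyTTLCache` is an illustrative class, not the package's `TTLCache`; its method signatures and the injectable `clock` argument are assumptions made so that expiry can be shown without sleeping.

```python
import time
from typing import Any, Callable, Optional


class TinyTTLCache:
    """Illustrative TTL cache; not the package's TTLCache implementation."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._store: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl: float) -> None:
        # Store the absolute monotonic deadline alongside the value.
        self._store[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        deadline, value = entry
        if self._clock() >= deadline:
            # Expired: drop the entry and report a miss.
            del self._store[key]
            return None
        return value


# A fake clock makes expiry observable without sleeping.
now = [0.0]
cache = TinyTTLCache(clock=lambda: now[0])
cache.set("tickers", {"AAPL": "0000320193"}, ttl=60)
hit = cache.get("tickers")
now[0] = 61.0
miss = cache.get("tickers")
```

Since `time.monotonic()` never goes backwards, a system clock change cannot revive an expired entry or expire a fresh one early.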
- tests/test_cache.py: 29 unit tests for `TTLCache` (get/set, expiration, invalidate, clear, len/repr, constants) and cache integration with the `Tickers`, `Submissions`, and `Xbrl` services.
- edgar/__init__.py: Top-level convenience functions for reduced boilerplate.
  - `edgar.company("AAPL")`: create a `Company` without instantiating `EdgarClient`.
  - `edgar.get_filings("AAPL", form="10-K")`: fetch filings in one call.
  - `edgar.search("revenue recognition")`: full-text search in one call.
  - `edgar.set_user_agent()`: set user-agent programmatically.
  - `SEC_EDGAR_USER_AGENT` environment variable auto-detected as fallback.
  - Lazy singleton `EdgarClient` created on first use and cached.
- tests/test_convenience.py: 17 unit tests for convenience functions (`set_user_agent`, `_get_client`, `company`, `get_filings`, `search`, env var fallback, caching, exports).
- samples/use_convenience.py: Sample file demonstrating top-level convenience functions (env var, `set_user_agent`, `company`, `get_filings`, `search`).
- edgar/models.py: `Fact` and `Facts` XBRL dataclass models.
  - `Facts` wraps the deeply nested `company_facts` JSON (4 levels) with `get(taxonomy, concept, unit=None)` returning a flat `list[Fact]` sorted by end date.
  - `Facts.taxonomies` lists available namespaces (e.g. `['dei', 'us-gaap', 'ifrs-full']`).
  - `Facts.concepts(taxonomy)` lists concept names within a taxonomy.
  - `Facts.label()`, `Facts.description()`, `Facts.units()` for concept metadata.
  - `Fact` wraps a single data point with `value`, `end`, `start`, `fiscal_year`, `fiscal_period`, `form`, `filed`, `frame` properties.
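
Flattening a nested facts payload into a sorted list can be sketched as below. The payload is made up, and its key names (`facts`, `units`, `end`, `val`) are assumptions about the nested layout rather than something this changelog specifies; `get_points` is a hypothetical helper, not `Facts.get()`.

```python
from typing import Optional

# Made-up payload shaped like the nested layout described above:
# facts -> taxonomy -> concept -> units -> unit -> list of data points.
payload = {
    "facts": {
        "us-gaap": {
            "Revenues": {
                "label": "Revenues",
                "units": {
                    "USD": [
                        {"end": "2023-12-31", "val": 200, "form": "10-K"},
                        {"end": "2022-12-31", "val": 100, "form": "10-K"},
                    ]
                },
            }
        }
    }
}


def get_points(data: dict, taxonomy: str, concept: str,
               unit: Optional[str] = None) -> list[dict]:
    """Flatten one concept into a list of points sorted by end date."""
    units = data.get("facts", {}).get(taxonomy, {}).get(concept, {}).get("units", {})
    points = [
        {**point, "unit": unit_name}
        for unit_name, rows in units.items()
        if unit is None or unit_name == unit
        for point in rows
    ]
    # ISO dates sort correctly as plain strings.
    return sorted(points, key=lambda p: p.get("end", ""))


points = get_points(payload, "us-gaap", "Revenues", unit="USD")
```

Chained `.get()` calls with empty-dict defaults mean an unknown taxonomy or concept yields an empty list instead of a `KeyError`.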
- xbrl.py: `get_facts(cik)` method returning a structured `Facts` model.
- company.py: `get_facts()` method returning a structured `Facts` model.
- tests/test_xbrl_facts.py: 39 unit tests for `Fact`, `Facts`, `Company.get_facts()`, `Xbrl.get_facts()`, and taxonomy parameter support.
- edgar/models.py: `to_dataframe()` standalone function and `Facts.to_dataframe()` method for pandas integration.
  - `to_dataframe(items)` converts any list of model objects (`Filing`, `Fact`, `Submission`, `SearchResult`, `CompanyInfo`) to a `pandas.DataFrame`.
  - `Facts.to_dataframe(taxonomy, concept, unit=None)` returns fact data points as a DataFrame.
  - Graceful `ImportError` with message `"pip install python-sec[pandas]"` when pandas is not installed.
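
The graceful `ImportError` behaviour for an optional dependency can be sketched generically. `require_optional` is a hypothetical helper and the exact message wording is an assumption; only the install hint string comes from the entry above.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import an optional dependency or raise ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this feature: "
            f"pip install python-sec[{extra}]"
        ) from exc


# A module name that should not exist, to exercise the failure path.
try:
    require_optional("definitely_not_installed_module_xyz", "pandas")
    message = ""
except ImportError as err:
    message = str(err)
```

Deferring the import to the call site keeps the base package importable when the extra is not installed.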
- pyproject.toml: `[pandas]` optional dependency group (`pandas>=2.0`).
- tests/test_to_dataframe.py: 20 unit tests for `to_dataframe()`, `Facts.to_dataframe()`, and graceful import error handling.
- samples/use_dataframes.py: Sample file demonstrating DataFrame conversion for facts, filings, search results, and submissions.
- edgar/client.py: `EdgarClient` now accepts a `rate_limit` parameter: `EdgarClient(user_agent="...", rate_limit=5)`.
  - Defaults to 10 (SEC's maximum). Validates range 1–10.
  - Passed through to `EdgarSession` for sliding-window enforcement.
- edgar/client.py: `EdgarClient` now accepts a `cache` parameter (bool, default `True`).
  - `cache=True` creates a shared `TTLCache` passed to `EdgarSession`.
  - `cache=False` disables caching; all requests hit the network.
- edgar/session.py: `EdgarSession` accepts an optional `cache` parameter storing a `TTLCache` instance.
- edgar/tickers.py: `Tickers._load()` checks/stores data in the TTL cache (`TTL_TICKERS`).
- edgar/submissions.py: `Submissions.get_submissions()` checks/stores responses in the TTL cache (`TTL_SUBMISSIONS`).
- edgar/xbrl.py: `Xbrl.company_facts()` checks/stores responses in the TTL cache (`TTL_TAXONOMY`).
- tests/test_rate_limiter.py: 8 new tests for configurable rate limit (custom values, boundary validation, client passthrough). Total: 17 tests.
- xbrl.py: `company_concepts()` and `frames()` now accept an optional `taxonomy` parameter (default `"us-gaap"`). Previously hardcoded to `us-gaap`; now supports `"ifrs-full"`, `"dei"`, or any other taxonomy.
- edgar/tickers.py: New `Tickers` service for ticker/CIK/company name resolution via `sec.gov/files/company_tickers.json`.
  - `resolve_ticker("AAPL")` → zero-padded CIK string (`"0000320193"`).
  - `resolve_cik(320193)` → list of company entries (ticker, title, CIK).
  - `search("Apple")` → case-insensitive fuzzy search across tickers and company names.
  - Data is fetched once and cached in memory for the session lifetime.
- session.py: `download(url, path=None)` method to fetch filing documents (HTML, XML, PDF) from any SEC URL. Auto-detects text vs binary content. Optional `path` parameter saves directly to file.
- client.py: Convenience methods `resolve_ticker()`, `resolve_cik()`, `tickers()`, and `download()` on `EdgarClient`.
- samples/use_tickers_and_download.py: Sample file demonstrating ticker resolution, company search, and filing download.
- tests/test_tickers.py: 14 unit tests for the `Tickers` service (resolve, reverse lookup, search, caching, error handling).
- tests/test_download.py: 9 unit tests for `download()` (text/binary content, save-to-file, error handling, client delegation).
- edgar/company.py: New fluent `Company` class for ticker-based SEC EDGAR access.
  - `client.company("AAPL")` resolves ticker or CIK → `Company` object with `cik`, `ticker`, `name` properties.
  - `company.filings(form="10-K")`: fluent chaining to get filings without separate service objects.
  - `company.submissions()`: fetch full submission history.
  - `company.xbrl_facts()`: fetch XBRL company facts.
  - `company.download(url)`: download filing documents.
  - Accepts both ticker symbols (`"AAPL"`) and CIK numbers (`"320193"`).
- client.py: `company()` method on `EdgarClient` for fluent company access. Existing `filings()`/`companies()` methods remain for backward compatibility.
- tests/test_company.py: 21 unit tests for the `Company` class (construction, properties, fluent methods, client integration, backward compat).
- edgar/models.py: New structured dataclass response models: `Filing`, `CompanyInfo`, `Submission`.
  - `Filing` wraps filing search result dicts with `form_type`, `filing_date`, `url`, `accession_number`, `title`, `summary` properties.
  - `CompanyInfo` wraps submissions metadata with `name`, `cik`, `tickers`, `sic`, `sic_description`, `recent_filings`, `recent_submissions` properties.
  - `Submission` wraps individual filing records with `form`, `filing_date`, `accession_number`, `report_date`, `is_xbrl`, `size` properties.
  - All models expose a `.raw` attribute for backward compatibility with raw dict access.
  - All models are frozen (immutable) with `__repr__` for REPL/notebook discoverability.
- company.py: `get_filings(form=None)` → `list[Filing]` and `get_info()` → `CompanyInfo` convenience methods returning structured models.
- tests/test_models.py: 23 unit tests for all three models and the Company integration methods.
- README.md: Complete rewrite with hero example, full service table (15 services), usage examples for ticker resolution, fluent Company API, XBRL, filing search, downloads, response models, and badge row.
- samples/use_company.py: Sample file demonstrating the fluent Company interface (creation by ticker/CIK, filings, submissions, XBRL, download).
- samples/use_models.py: Sample file demonstrating structured dataclass response models (`Filing`, `CompanyInfo`, `Submission`).
- samples/use_xbrl_facts.py: Sample file demonstrating `Facts` and `Fact` XBRL dataclass models (taxonomy browsing, concept retrieval, unit filtering, metadata, cross-taxonomy access).
- tests/test_rate_limiter.py: 9 unit tests for the sliding-window rate limiter (under-limit, at-limit sleep, timestamp expiry, integration checks for all three request paths).
- edgar/search.py: New `Search` service wrapping the EDGAR Full-Text Search (EFTS) endpoint at `efts.sec.gov/LATEST/search-index`.
  - `full_text_search(q, form_types=None, start_date=None, end_date=None, start=0, size=100)`: query filings by keyword, form type, and date range.
- edgar/models.py: `SearchResult` dataclass wrapping EFTS Elasticsearch hit dicts.
  - Properties: `company_name`, `cik`, `form`, `filing_date`, `accession_number`, `file_type`, `file_description`, `period_ending`, `url`.
  - URL constructed from the `_id` field (`{adsh}:{filename}`) pointing to the full filing document on SEC.gov.
- edgar/client.py: `search()` convenience method and `full_text_search()` service accessor for EFTS search.
- edgar/session.py: `build_url()` and `make_request()` accept optional `base_url` parameter to support third-party SEC endpoints (e.g. `efts.sec.gov`).
- tests/test_search.py: 35 unit tests for `SearchResult` model, `Search` service, `EdgarClient.search()` integration, and `build_url` base_url parameter.
- samples/use_search.py: Sample file demonstrating full-text search (basic query, form type filtering, date ranges, result properties, pagination).
- edgar/models.py: `_repr_html_()` on all six response models for Jupyter/notebook rendering.
  - `Filing`, `CompanyInfo`, `Submission`, `Fact`, `Facts`, `SearchResult` auto-render as styled HTML tables.
  - Helper functions `_html_kv_table()`, `_html_row_table()`, `_esc()` for XSS-safe HTML generation.
  - Inline CSS constants (`_TABLE_STYLE`, `_TH_STYLE`, `_TD_STYLE`, `_CAPTION_STYLE`) for consistent styling across Jupyter Lab, Notebook, VS Code, and Colab.
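
XSS-safe notebook rendering comes down to escaping every dynamic value before it is placed in markup. The sketch below uses the standard library's `html.escape`; `html_kv_table` and `Record` are hypothetical names, not the package's helpers or models.

```python
from html import escape


def html_kv_table(rows: dict, caption: str) -> str:
    """Render key/value pairs as an HTML table, escaping every dynamic value."""
    body = "".join(
        f"<tr><th>{escape(str(k))}</th><td>{escape(str(v))}</td></tr>"
        for k, v in rows.items()
    )
    return f"<table><caption>{escape(caption)}</caption>{body}</table>"


class Record:
    """Hypothetical model; Jupyter calls _repr_html_() to display instances."""

    def __init__(self, name: str) -> None:
        self.name = name

    def _repr_html_(self) -> str:
        return html_kv_table({"name": self.name}, caption="Record")


markup = Record("<script>alert(1)</script>")._repr_html_()
```

Jupyter looks for `_repr_html_()` on the last expression of a cell, so defining it is enough to get table output without any explicit display call.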
- tests/test_repr_html.py: 35 unit tests for `_repr_html_()` on all models (HTML output, key values, XSS escaping, edge cases).
- samples/demo_jupyter_rendering.ipynb: Jupyter notebook demonstrating auto-rendering for all model types and DataFrame conversion.
- session.py: Replaced counter-based rate limiter (`sleep 5s every 10 requests`) with a sliding-window algorithm using a `collections.deque` of `time.monotonic()` timestamps. Sleeps only the minimum time needed when the 1-second window is full. `MAX_REQUESTS_PER_SECOND = 10` enforced per SEC policy.
- session.py: Rate limiting now applies to all three outgoing request paths (`make_request()`, `fetch_page()`, `download()`). Previously `fetch_page()` and `download()` bypassed rate limiting entirely.
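
One way to read the sliding-window description is the sketch below. `SlidingWindowLimiter` is an illustrative class, not the session's actual code; the injectable `clock` and `sleep` arguments are assumptions added so the behaviour can be shown without real waiting.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_calls per window seconds; a sketch, not the library code."""

    def __init__(self, max_calls: int = 10, window: float = 1.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.max_calls = max_calls
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._stamps: deque = deque()

    def _evict(self, now: float) -> None:
        # Drop timestamps that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()

    def acquire(self) -> float:
        """Block until a slot is free; return the seconds slept."""
        now = self._clock()
        self._evict(now)
        waited = 0.0
        if len(self._stamps) >= self.max_calls:
            # Sleep only until the oldest timestamp leaves the window.
            waited = self.window - (now - self._stamps[0])
            self._sleep(waited)
            now = self._clock()
            self._evict(now)
        self._stamps.append(now)
        return waited


# Demo with a fake clock so nothing actually sleeps.
now = [0.0]


def fake_sleep(seconds: float) -> None:
    now[0] += seconds


limiter = SlidingWindowLimiter(max_calls=2, window=1.0,
                               clock=lambda: now[0], sleep=fake_sleep)
waits = [limiter.acquire() for _ in range(3)]
```

Calling `acquire()` at the top of every outgoing request path is what keeps paths such as page fetches and downloads from bypassing the limit.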
- Migrated from `setup.py` to `pyproject.toml` for modern packaging.
- Relaxed dependency version pins to use minimum ranges instead of exact versions.
- Updated minimum Python version to 3.9.
- Excluded `samples/` and `tests/` from distributed package.
- enums.py: Renamed `StateCodes` members from mixed-case (`Alabama`, `New_York`) to UPPER_CASE (`ALABAMA`, `NEW_YORK`) to follow Python enum naming conventions.
- utils.py: Exception chaining: `except ValueError as exc` / `raise ... from exc` in `parse_dates`.
- session.py: Replaced infinite retry loop with bounded retry (max 5) and exponential backoff.
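
Bounded retry with exponential backoff, combined with exception chaining, can be sketched as follows. `with_retries`, `RequestFailed`, the 0.5-second base delay, and the choice to retry on `OSError` are all assumptions for illustration; only the cap of 5 attempts comes from the entry above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RequestFailed(Exception):
    """Hypothetical wrapper error, standing in for a library-specific exception."""


def with_retries(call: Callable[[], T], max_attempts: int = 5,
                 base_delay: float = 0.5,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run call(), retrying on OSError with delays of base_delay * 2**attempt."""
    for attempt in range(max_attempts):
        try:
            return call()
        except OSError as exc:
            if attempt == max_attempts - 1:
                # Out of attempts: surface one wrapped, chained error.
                raise RequestFailed(f"gave up after {max_attempts} attempts") from exc
            sleep(base_delay * 2 ** attempt)
    # Only reached when max_attempts is less than 1.
    raise ValueError("max_attempts must be at least 1")


# Demo: fail twice, then succeed; record the delays instead of sleeping.
delays: list[float] = []
attempts = {"n": 0}


def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = with_retries(flaky, sleep=delays.append)
```

The `from exc` clause keeps the last low-level error available as `__cause__`, so the wrapped exception does not hide what actually failed.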
- session.py: Broadened exception handling from `HTTPError` to `RequestException` to catch connection and timeout errors.
- session.py: Reuse a single `requests.Session` with connection pooling instead of creating a new session per request.
- session.py: Added `application/json` content-type handler so JSON API responses are returned correctly.
- session.py: Changed rate-limit check from `== 9` to `>= RATE_LIMIT_INTERVAL` to prevent missed triggers.
- session.py: Replaced `print()` calls with `logging` module.
- session.py: Removed dead code that caused `SyntaxError`.
- session.py: Now raises `EdgarRequestError` instead of raw `requests` exceptions.
- parser.py: Replaced `ET.fromstring()` with `defusedxml.ElementTree.fromstring()` to prevent XXE attacks.
- parser.py: Routed all HTTP requests through a shared session with the required SEC `User-Agent` header.
- parser.py: Replaced `print()` calls with `logging` module.
- parser.py: Now raises `EdgarParseError` on XML parse failures instead of raw `ET.ParseError`.
- submissions.py: Added CIK input validation (`isdigit()`) to prevent path traversal.
- xbrl.py: Added CIK input validation (`isdigit()`) in `company_concepts` and `company_facts`.
- utilis.py: `parse_dates` now raises `ValueError` on invalid date strings instead of silently returning `None`.
- Service methods: All 26 methods across 7 service files now use `try/finally` to ensure `_reset_params()` cleanup on errors.
- session.py: Typed `client` parameter as `EdgarClient` (forward reference) instead of `object`; `make_request` return type now `Union[dict, str, None]`.
- 14 service/utility files: Added `from __future__ import annotations` and corrected all return type annotations to match actual return values (`list[dict]`, `dict | None`, PEP 604 syntax).
- parser.py: Added `from __future__ import annotations`; replaced `List[Dict]`/`List[str]` with `list[dict]`/`list[str]`; fixed `_grab_next_page` return type to `ET.Element | None`, `_parse_issuer_next_button` to `str | None`.
- archives.py: Removed redundant `.format()` call on f-string in `get_company_directory`.
- session.py: Removed unused `pathlib`, `sys` imports and log directory creation + `logging.basicConfig()` calls (libraries should not configure logging).
- 6 service classes: Filled in empty class-level docstrings (`Archives`, `CurrentEvents`, `Issuers`, `Series`, `OwnershipFilings`, `VariableInsuranceProducts`).
- 15 files: Standardized all 120 docstring section headings: `### Arguments:` → `### Parameters`, removed trailing colons from `### Returns:`, `### Usage:`, `### Overview:`.
- README.md: Removed duplicate "Setup - PyPi Install" and "Setup - PyPi Upgrade" sections.
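
The CIK validation entries above guard against identifiers that would alter the request path. A minimal sketch of the idea follows; `submissions_path` and the path layout it returns are illustrative assumptions, not the library's actual route-building code.

```python
def submissions_path(cik: str) -> str:
    """Build a request path for a CIK, rejecting anything that is not all digits."""
    if not cik.isdigit():
        # Values such as "../../etc/passwd" never reach the URL.
        raise ValueError(f"CIK must contain only digits, got {cik!r}")
    # The path layout here is illustrative, not the library's actual route.
    return f"/submissions/CIK{cik.zfill(10)}.json"


try:
    submissions_path("../secret")
    rejected = False
except ValueError:
    rejected = True
```

Note that `str.isdigit()` also accepts some non-ASCII digit characters, so pairing it with `str.isascii()` is a stricter option where that matters.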
- client.py: Service factory methods now cache instances (lazy-init) so repeated calls return the same object.
- All service files: Replaced mutable `self.params` instance state with local `params` dicts in every method, removing `_reset_params()` entirely. Fixes thread-safety and state-leak bugs.
- session.py: Now creates shared `EdgarParser` and `EdgarUtilities` instances; all services reference these instead of creating their own.
- parser.py: Removed direct HTTP calls (`requests`/`urllib3` imports); pagination now uses a `fetch_page` callback injected by the caller. Session provides `fetch_page()` method.
- utilis.py → utils.py: Renamed module from `utilis.py` to `utils.py` (typo fix); updated import in `session.py`.
- companies.py: Fixed "Comapnies" typo → "Companies" in `__repr__` and `__init__` docstring.
- samples/: Added module docstrings to all 13 sample files.
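
The state-leak problem that motivated the move from `self.params` to local dicts can be reproduced in a few lines. Both classes below are hypothetical and much simpler than the real services; they only contrast shared instance state with per-call state.

```python
class SharedStateService:
    """Anti-pattern: parameters live on the instance and leak between calls."""

    def __init__(self) -> None:
        self.params: dict = {}

    def query(self, **filters) -> dict:
        self.params.update(filters)  # earlier filters are still present
        return dict(self.params)


class LocalStateService:
    """Each call builds its own dict, so nothing carries over."""

    def query(self, **filters) -> dict:
        params = dict(filters)
        return params


shared = SharedStateService()
shared.query(form="10-K")
leaked = shared.query(state="NY")

local = LocalStateService()
local.query(form="10-K")
clean = local.query(state="NY")
```

With per-call dicts there is nothing to reset after an error, and two threads calling the same service object cannot overwrite each other's parameters.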
- samples/use_client.py: Fixed "Initalize" typo → "Initialize"; removed unused imports (`pprint`, `date`, `FilingTypeCodes`).
- samples/use_issuers_service.py: Removed unused imports (`StateCodes`, `CountryCodes`, `StandardIndustrialClassificationCodes`).
- samples/use_company_service.py: Sorted imports alphabetically; updated `StateCodes.West_Virginia` → `StateCodes.WEST_VIRGINIA`.
- All examples: Updated `EdgarClient()` → `EdgarClient(user_agent=...)` across 13 sample files, README.md, test file, and 55 docstring examples in 14 `edgar/` modules to reflect the required `user_agent` parameter.
- `edgar/parser/xbrl.py`: Deleted `XbrlFiling` stub class; never imported or referenced.
- `edgar/parser/`: Removed empty directory that conflicted with `parser.py` module.
- `edgar/enums.py`: Replaced monolithic 1581-line file with `edgar/enums/` package: one module per enum class (`state_codes.py`, `country_codes.py`, `filing_type_codes.py`, `sic_codes.py`, `other_filing_types.py`) plus `__init__.py` re-exporting all names. All existing `from edgar.enums import X` imports continue to work.
- `py.typed` marker for PEP 561 type checker support.
- `CHANGELOG.md` to track version history.
- `.gitignore` file.
- `defusedxml>=0.7.1` dependency for safe XML parsing.
- `edgar/exceptions.py` module with `EdgarError`, `EdgarRequestError`, `EdgarParseError` custom exceptions.
- `edgar/__init__.py` with public API exports (`from edgar import EdgarClient`).
- `tests/conftest.py` with shared fixtures (`edgar_client`, `edgar_session`, `edgar_parser`, `edgar_utilities`) and sample data constants (Atom feeds, JSON submissions, XBRL facts, directory listings).
- `tests/test_parser.py`: 12 unit tests for `EdgarParser` covering `parse_entries`, pagination, `check_for_next_page`, and `parse_entry_element`.
- `tests/test_services.py`: 17 mocked HTTP tests for session URL building, `make_request`, and service methods (companies, submissions, XBRL, archives, filings).
- `tests/test_utils.py`: 13 tests for `parse_dates`, `clean_directories`, and `clean_filing_directory`.
- `.github/workflows/ci.yml`: GitHub Actions CI workflow testing Python 3.9–3.13 on push/PR to master.
- Test class rename: `Edg` → `TestEdgarClient` in `test_edgar_client.py`.
- Initial public release.
- EDGAR client with services: Archives, Companies, CurrentEvents, Datasets, Filings, Issuers, MutualFunds, OwnershipFilings, Series, Submissions, VariableInsuranceProducts, XBRL.