📆 Changelog

Welcome to the project changelog. All notable changes to this project will be documented below.

0.3.7 - 2025-09-02

In August 2025, subtle changes to the judiciary.uk website broke PFD Toolkit's scraper, so newly published reports could not be collected. This issue has been resolved, and all previously missed reports have been added.

0.3.6 - 2025-08-03

Improved reliability and performance of the Scraper and Cleaner modules.
The Cleaner module now standardises each report 'area' to one of 77 official jurisdictions (e.g. "Liverpool and the Wirral"), so minor variations and typos are automatically corrected for consistent regional filtering.
load_reports() now refreshes the dataset by default. Pass refresh=False to use a previously cached copy instead of downloading again.
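
As an illustration of the kind of area normalisation described above, the following minimal sketch maps a noisy area string to the closest entry in a list of canonical names, using fuzzy matching from Python's standard library. It is not PFD Toolkit's actual implementation: `OFFICIAL_AREAS` is a short illustrative stand-in for the full list of 77, and `standardise_area` is a hypothetical helper written for this example.

```python
from difflib import get_close_matches
from typing import Optional

# Illustrative stand-in for the full list of 77 official areas (assumption).
OFFICIAL_AREAS = [
    "Liverpool and the Wirral",
    "Manchester South",
    "Inner North London",
]


def standardise_area(raw: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the closest official area name, or None if nothing is close enough."""
    # Compare in lower case so capitalisation differences do not matter.
    lookup = {area.lower(): area for area in OFFICIAL_AREAS}
    # Collapse repeated whitespace and trim the ends before matching.
    cleaned = " ".join(raw.split()).lower()
    matches = get_close_matches(cleaned, list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None
```

With this sketch, an input such as "liverpool and wirral" resolves to "Liverpool and the Wirral", while an unrelated string returns None rather than being forced into a category.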

0.3.5 - 2025-07-07

Fixed an issue where PFD Toolkit would not run in Google Colab.

0.3.4 - 2025-07-07

Deprecated user_query in Screener in favour of search_query. user_query will be removed in a future release.
Dropping spans in extract_features() no longer removes spans added during screening.
Downgraded pandas from 2.3.0 to 2.2.2.
Fixed text cleaning bug that expanded dates and removed paragraph spacing.
Added tests covering span removal behaviour.
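
The user_query deprecation above follows a common pattern: keep accepting the old keyword for a while, forward it to the new one, and emit a warning. The sketch below shows that pattern in isolation. `resolve_query` is a hypothetical helper written for this example and does not reflect how Screener is implemented internally.

```python
import warnings
from typing import Optional


def resolve_query(search_query: Optional[str] = None,
                  user_query: Optional[str] = None) -> str:
    """Return the effective query, accepting the deprecated `user_query` alias."""
    if user_query is not None:
        # Warn the caller, pointing at their call site rather than this helper.
        warnings.warn(
            "user_query is deprecated; use search_query instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        # The new keyword wins if both are supplied.
        if search_query is None:
            search_query = user_query
    if search_query is None:
        raise ValueError("search_query is required")
    return search_query
```

Letting the new keyword take precedence when both are passed is a design choice made for this sketch; raising an error in that case would be an equally reasonable alternative.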

0.3.3 - 2025-06-25

Improved package installation time.
Changed the default LLM from GPT-4.1-mini to GPT-4.1.

0.3.2 - 2025-06-23

You no longer need to manually update the pfd_toolkit package to get access to freshly published reports. Instead, run load_reports(refresh=True).
Improved robustness of the Scraping module in handling missing data between different scraping strategies.
Fixed typos and improved documentation.
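
To make the refresh option above concrete, here is a minimal sketch of a loader that either reuses a cached copy or fetches a fresh one. It does not use or reproduce PFD Toolkit's code: `load_dataset`, `cache_path` and the `fetch` callable are hypothetical names chosen for this example, and the real load_reports() may cache its data differently.

```python
from pathlib import Path
from typing import Callable


def load_dataset(cache_path: Path, fetch: Callable[[], str], *, refresh: bool) -> str:
    """Return the dataset text, fetching it when refresh is True or no cache exists."""
    if refresh or not cache_path.exists():
        data = fetch()  # in real code this would be a download
        cache_path.write_text(data, encoding="utf-8")
        return data
    # Otherwise reuse the copy saved by an earlier call.
    return cache_path.read_text(encoding="utf-8")
```

Passing the fetch step in as a callable keeps the sketch runnable offline and makes the caching behaviour easy to test.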

0.3.1 - 2025-06-19

Improved reliability of weekly dataset top-ups.

0.3.0 - 2025-06-18

First public release! ✨