Changelog
Welcome to the project changelog. All notable changes to this project will be documented below.
0.3.6 - 2025-08-03
- Improved reliability and performance of the Scraper and Cleaner modules.
- The Cleaner module now standardises each report 'area' to one of 77 official jurisdictions (e.g. "Liverpool and the Wirral"), so minor variations and typos are automatically corrected for consistent regional filtering.
- load_reports() now refreshes the dataset by default. Pass refresh=False to use a previously cached copy instead of downloading again.
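As a rough illustration of the area standardisation in 0.3.6, the sketch below maps a free-text area onto a fixed list of canonical names using the standard library's difflib. It is not the Cleaner module's actual implementation, which this changelog does not describe. `CANONICAL_AREAS` and `standardise_area` are hypothetical names, and only "Liverpool and the Wirral" is taken from the entry above; the other list items are placeholders.

```python
from difflib import get_close_matches

# Hypothetical subset for illustration; the real list has 77 jurisdictions.
CANONICAL_AREAS = [
    "Liverpool and the Wirral",
    "Example Area North",  # placeholder, not a real jurisdiction
    "Example Area South",  # placeholder, not a real jurisdiction
]


def standardise_area(raw: str, cutoff: float = 0.6) -> str:
    """Return the closest canonical area, or "Other" if nothing is close enough."""
    # Compare case-insensitively, then map back to the canonical spelling.
    lookup = {name.lower(): name for name in CANONICAL_AREAS}
    matches = get_close_matches(raw.strip().lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else "Other"
```

With a mapping like this, variants such as "Liverpool and Wirral" resolve to the same canonical name, so filtering by area treats them as one region.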
0.3.5 - 2025-07-07
- Fixed issue where PFD Toolkit refused to run in Google Colab
0.3.4 - 2025-07-07
- Deprecated user_query in Screener in favour of search_query. user_query will be removed in a future release.
- Dropping spans in extract_features() no longer removes spans added during screening.
- Downgraded pandas from 2.3.0 to 2.2.2
- Fixed text cleaning bug that expanded dates and removed paragraph spacing.
- Added tests covering span removal behaviour.
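The user_query deprecation in 0.3.4 follows a common pattern: keep accepting the old keyword for a while, but warn whenever it is used. The sketch below shows one way this could look. `resolve_query` is a hypothetical helper written for illustration and is not taken from the Screener source.

```python
import warnings


def resolve_query(search_query=None, user_query=None):
    """Return the query to use, accepting the deprecated name with a warning."""
    if user_query is not None:
        warnings.warn(
            "user_query is deprecated; use search_query instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not this helper
        )
        # The new name wins if both are supplied.
        if search_query is None:
            search_query = user_query
    return search_query
```

Callers that still pass user_query keep working but see a DeprecationWarning, which gives them time to switch to search_query before the old name is removed.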
0.3.3 - 2025-06-25
- Improved package installation time
- Changed default LLM model from GPT-4.1-mini to GPT-4.1
0.3.2 - 2025-06-23
- You no longer need to manually update the pfd_toolkit package to get access to freshly published reports. Instead, run load_reports(refresh=True).
- Improved robustness of the Scraping module in handling missing data between different scraping strategies.
- Fixed typos and improved documentation.
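The refresh flag introduced in 0.3.2 can be pictured as a simple cache-or-download switch. The sketch below is a generic illustration of that idea, not the code behind load_reports(). `load_with_cache`, `cache_path` and `fetch` are hypothetical names, and the default of refresh=False is an assumption made for this example only.

```python
from pathlib import Path
from typing import Callable


def load_with_cache(cache_path: Path, fetch: Callable[[], str], refresh: bool = False) -> str:
    """Return cached text unless a refresh is requested or no cache exists yet."""
    if refresh or not cache_path.exists():
        # Download (via the supplied fetch callable) and overwrite the cache.
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(fetch(), encoding="utf-8")
    return cache_path.read_text(encoding="utf-8")
```

The first call populates the cache, later calls reuse it, and passing refresh=True forces a fresh download.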
0.3.1 - 2025-06-19
- Improved reliability of weekly dataset top-ups.
0.3.0 - 2025-06-18
First public release! ✨