Changelog

Unreleased (YYYY-MM-DD)

Changes

Add select_consensus_statements() function, and wire into Polis implementation.
Allow calculate_comment_statistics() to work without groups/labels.
Generalize format_comment_stats() to work for group and consensus statements.
Add select_representative_statements() to PolisClusteringResult as repness key.
Rename arg pick_n to pick_max in select_consensus_statements(), for clarity and consistency.
Slight change to PolisRepness type, so group IDs are now returned as ints.
Add print_selected_statements() presenter for inspecting PolisClusteringResult.
Add print_consensus_statements() presenter for inspecting PolisClusteringResult.
Allow pick_max and confidence interval args to be set in polis.run_clustering().
Allow get_corrected_centroid_guesses() to unflip each axis if correction not needed.
Abstracted reducer and clusterer algorithm support.
Added support for pacmap/localmap beyond PCA.
Added support for HDBSCAN clustering beyond KMeans.
Allow passing of arbitrary params into reducer/clusterer.
Remove support for polis_legacy implementation (PolisClient).
Added disagree variant of group-informed-consensus (group-informed-consensus-disagree).
Brought group-informed-consensus metrics to top-level result object.
Renamed run_clustering function to run_pipeline and created base pipeline implementation.
Add option to generate_figure_polis to configure showing pid labels (show_pids).
Remove deprecated methods from doc website.
Remove deprecated modules from prior import paths.
Avoid using dataframes in a few low-level util functions, in favour of numpy arrays.
Rename projected_{participants,statements} to {participant,statement}_projections in run_pipeline results. Coordinates are now keyed by ID, instead of being stored in dataframes.
Remove agora implementation and tests. (#73)

Fixes

Handle when is-meta and is-seed columns arrive in CSV import. (#55)
Handle loading comments data from API when is_meta missing in CSV import.
Only pass unique labels into generate_figure() colorbar.
bugfix: clusterer_kwargs and reducer_kwargs were not being passed through run_pipeline().
bugfix: Ensure run_pipeline() passes random_state to reducer.
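
The two pass-through fixes above concern forwarding caller-supplied options to the underlying estimators. A minimal sketch of that pattern follows. It is not the library's code: run_pipeline_sketch and RecordingEstimator are hypothetical names invented for illustration, and the real run_pipeline() signature may differ.

```python
from typing import Any, Callable, Optional


def run_pipeline_sketch(
    make_reducer: Callable[..., Any],
    make_clusterer: Callable[..., Any],
    reducer_kwargs: Optional[dict] = None,
    clusterer_kwargs: Optional[dict] = None,
    random_state: Optional[int] = None,
) -> tuple:
    """Hypothetical stand-in for a pipeline entry point (assumption, not the real API)."""
    # Copy the dicts so the caller's objects are never mutated.
    reducer_kwargs = dict(reducer_kwargs or {})
    clusterer_kwargs = dict(clusterer_kwargs or {})
    # Forward the seed to the reducer unless the caller already set one explicitly.
    if random_state is not None:
        reducer_kwargs.setdefault("random_state", random_state)
    reducer = make_reducer(**reducer_kwargs)
    clusterer = make_clusterer(**clusterer_kwargs)
    return reducer, clusterer


class RecordingEstimator:
    """Toy estimator that only records the keyword arguments it received."""

    def __init__(self, **kwargs: Any) -> None:
        self.kwargs = kwargs


reducer, clusterer = run_pipeline_sketch(
    make_reducer=RecordingEstimator,
    make_clusterer=RecordingEstimator,
    reducer_kwargs={"n_components": 2},
    clusterer_kwargs={"n_clusters": 3},
    random_state=42,
)
print(reducer.kwargs)
print(clusterer.kwargs)
```

Using setdefault() means an explicit random_state inside reducer_kwargs would win over the top-level argument. That precedence is a design choice of this sketch, not a documented behaviour of the library.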

Chores

Update the release process instructions.
Added simulate_api_response() test helper for easier comparison with polismath output.

0.3.0 (2025-04-29)

Fixes

Allow is_strict_moderation to be inferred from not just API data, but file data.
Better handle numpy divide-by-zero edge-cases in two-property test. (#28)
Fix bug where vote_matrix was modified directly, leading to subtle side-effects.
Fix bug in select_representative_statements() where mod-out statements weren't ignored.
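
The vote_matrix fix above is an instance of a common pitfall: a helper that edits its input in place silently changes the caller's data. The sketch below illustrates the general pattern with plain nested lists. The function names are hypothetical and the library's actual matrix type and fix may look different.

```python
import copy


def fill_missing_mutating(vote_matrix: list) -> list:
    """Problematic pattern: writes into the caller's matrix."""
    for row in vote_matrix:
        for j, vote in enumerate(row):
            if vote is None:
                row[j] = 0
    return vote_matrix


def fill_missing_safe(vote_matrix: list) -> list:
    """Safer pattern: work on a deep copy and leave the input untouched."""
    result = copy.deepcopy(vote_matrix)
    for row in result:
        for j, vote in enumerate(row):
            if vote is None:
                row[j] = 0
    return result


original = [[1, None], [None, -1]]
filled = fill_missing_safe(original)
print(original)  # still contains None entries
print(filled)    # None entries replaced by 0
```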

Changes

Fixed participant projections to map more closely to Polis with utils.pca.sparsity_aware_project_ptpt().
Add simple Polis implementation in reddwarf.implementations.polis.
Add singular polis_id arg as recommended way to download (auto-detect report_id vs conversation_id).
Calculate group-aware consensus stats. (#28)
Removed scale_projected_data() in PolisClient (now happens in run_pca()).
Deprecate PolisClient().
Add inverse_transform() to SparsityAwareScaler.
Add data loader support for local math data files.
Add support to easily flip signs in generate_figure().
Modify generate_figure() to accept more effective args.
Use numpy args of coord_data, coord_labels and cluster_labels individually, rather than using DataFrames.
Allow passing extra coord_data beyond what's labelled.
Add automatic padding to polis implementation when cluster centroid guesses are provided.
Add PolisKMeans scikit-learn estimator with:
cluster initialization strategy matching Polis,
new init_centers argument with more versatility for being given more/less guesses than needed, and
new instance variable init_centers_used_ to allow inspection of guesses used.
Allow passing KMeans init strategy into find_optimal_k().
Remove pad_centroid_list_to_length helper function.
Add GridSearchNonCV to find optimal K via silhouette scores.
For internal util functions, replace max_group_count args with k_bounds for upper and lower k bounds.
Add PolisKMeansDownsampler transformer to support base clustering.
Update get_corrected_centroid_guesses() to also extract from base clusters.
Remove extraneous return values from PolisClusteringResult.
Add data_presenter.generate_figure_polis() for making graphs from PolisClusteringResult.
Add group_aware_consensus dataframe to PolisClusteringResult of polis implementation.
Add group statement stats to MultiIndex DataFrame.
Add reddwarf.data_presenter.print_repness() for printing representative statements.
Add support for Loader() importing data from alternative Polis instances via polis_instance_url arg.
Patch sklearn with a simple PatchedPipeline, to allow pipeline steps to access other steps.
Modify SparsityAwareScaler to be able to use captured output from SparsityAwareCapturer.
Remove ported Polis PCA functions that are no longer used.
Remove old impute_missing_votes() function that's no longer used.
In PolisClusteringResult, created new statements_df and participants_df with all raw calculation values.
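
The PolisKMeans entry above mentions accepting more or fewer initial center guesses than the number of clusters needed. One way such reconciliation could work is sketched below. This is an assumption for illustration only: reconcile_init_centers is a hypothetical helper, and padding with the first distinct data points is a guess at a reasonable strategy, not a description of the estimator's internals.

```python
def reconcile_init_centers(guesses: list, points: list, n_clusters: int) -> list:
    """Hypothetical helper: return exactly n_clusters starting centers."""
    # Too many guesses: keep only the first n_clusters.
    centers = [list(g) for g in guesses[:n_clusters]]
    # Too few guesses: pad with data points not already used as a center.
    for point in points:
        if len(centers) >= n_clusters:
            break
        if list(point) not in centers:
            centers.append(list(point))
    if len(centers) < n_clusters:
        raise ValueError("not enough distinct points to pad the initial centers")
    return centers


points = [[0.0, 0.0], [1.0, 2.0], [1.0, 2.0], [3.0, 4.0]]
print(reconcile_init_centers([[0.0, 0.0]], points, 3))
print(reconcile_init_centers([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]], points, 2))
```

The centers actually used can then be inspected after fitting, which is the role the changelog describes for the init_centers_used_ instance variable.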

Chores

Moved agora implementation from reddwarf.agora to reddwarf.implementations.agora (deprecation warning).
Add missing conversation.json fixture file.
Extract statement processing from polis class-based client to pure util function.
Add types to fully describe polismath object. (#28)
Add new fixture for large convo without meta statements. (#28)
Add ability to filter unit tests and avoid running whole suite. (#44)
Improve test fixture to download remote Polis data.
Add helper to support simple sign-flips in Polis test data.
Remove usage of PolisClient in tests, in favour of [data] Loader.
Start storing keep_participant_ids in fixtures.
Add solid unit test for expected variance, which is the most stable measure we can derive.
Use dataclasses for polis_convo_data test fixture.
Add utils.polismath.get_corrected_centroid_guesses() to initiate centroid guesses from Polis API.
Remove unused init_cluster() helper.

0.2.0 (2025-03-24)

Fixed

Relax seaborn version constraint to be compatible with TabPFN. (#16)
Data loader was not downloading the last participant's votes, so most PCA results were slightly off. (#29)

Changes

Implement utils.calculate_representativeness() function. (#22)
Add color legend for labels in data_presenter.generate_figure(). d55f535 (#22)
Implement calculations of all comment statistics. (#25)
Implement utils.select_representative_statements() to reproduce polismath output. (#25)
Migrate from red-dwarf-democracy PyPI project namespace to red-dwarf.

Chores

Restructure utils.py into separate files. (#26)
Add unit tests for utils.run_pca() to test against real polismath data.
Add unit tests for agora.run_clustering().
Parametrize unit tests for real polis convo data.
Add testing for notebook examples. (#34)

0.1.1 (2025-03-04)

Bugfixes

Fix publishing issue with missing license file. (#19)
Workaround for pypa/setuptools#4769.
Change package name from reddwarf to red-dwarf-democracy.

0.1.0 (2025-03-04)

Add Agora/ZKorum to README as sponsor.
Add low-level stateless functions in reddwarf.utils.
Add first-pass unit test coverage, and CI.
Add mid-level stateless reddwarf.agora implementation.
Add preferred types/modules and function definitions.
Add code coverage reports for unit tests.
Document library usage in Jupyter notebooks.
Create documentation website using mkdocs.
Add make targets for common development tasks, and default help target.
Experimental
Add high-level stateful Polis client class implementation.
Add high-level stateful data loader class.
- Support loading from JSON API, CSV export API, local files, and remote directory.
- Support rate-limiting.
- Support request caching.
- Support bypass of Cloudflare for all API endpoints.
Add high-level stateful data presenter class.
Add integration tests.