red-dwarf
Changelog
Changes
Add select_consensus_statements()
function, and wire into Polis implementation.
Allow calculate_comment_statistics()
to work without groups/labels.
Generalize format_comment_stats()
to work for group and consensus statements.
Add select_representative_statements()
to PolisClusteringResult as repness
key.
Rename arg pick_n
to pick_max
in select_consensus_statements()
, for clarity and consistency.
Slight change to PolisRepness type, so group IDs now returned as ints.
Add print_selected_statements()
presenter for inspecting PolisClusteringResult
.
Add print_consensus_statements()
presenter for inspecting PolisClusteringResult
.
Allow pick_max
and confidence
interval args to be set in polis.run_clustering()
.
Allow get_corrected_centroid_guesses()
to unflip each axis if correction not needed.
Abstracted reducer and clusterer algorithm support.
Added support for pacmap/localmap beyond PCA.
Added support for HDBSCAN clustering beyond KMeans.
Allow passing of arbitary params into reducer/clusterer.
Remove support for polis_legacy
implementation (PolisClient).
Added disagree variant of group-informed-consensus. (group-informed-consensus-disagree
)
Brought group-informed-consensus
metrics to top-level result object.
Renamed run_clustering
function to run_pipeline
and created base pipeline implementation.
Add option to generate_figure_polis to configure showing pid labels (show_pids
).
Remove deprecated methods from doc website.
Remove deprecated modules from prior import paths.
Avoid using dataframes in a few low level util function, in favour of numpy arrays.
Rename projected_{participants,statements}
to {participant,statement}_projections
in run_pipeline results. Also coords keyed to ID, instead of dataframes.
Remove agora implementation and tests. (#73 )
Fixes
Handle when is-meta
and is-seed
columns arrive in CSV import.
#55
Handle loading comments data from API when is_meta
missing in CSV import.
Only pass unique labels into generate_figure()
colorbar.
bugfix: clusterer_kwargs
and reducer_kwargs
were not being pass through run_pipeline()
.
Chores
Update the release process instructions.
Added simulate_api_response()
test helper for easier comparison with polismath output.
0.3.0 (2025-04-29)
Fixes
Allow is_strict_moderation
to be inferred from not just API data, but file data.
Better handle numpy divide-by-zero edge-cases in two-property test. (#28 )
Fix bug where vote_matrix
was modified directly, leading to subtle side-effects.
Fix bug in select_representative_statements()
where mod-out statements weren't ignored.
Changes
Fixed participant projections to map more closely to Polis with utils.pca.sparsity_aware_project_ptpt()
.
Add simple Polis implementation in reddwarf.implementations.polis
.
Add singular polis_id
arg as recommended way to download (auto-detect report_id
vs converation_id
).
Calculate group-aware consensus stats. (#28 )
Removed scale_projected_data()
in PolisClient
(now happens in run_pca()
).
Deprecate PolisClient()
.
Add inverse_transform()
to SparsityAwareScaler
.
Add data loader support for local math data files.
Add support to easily flip signs in generate_figure()
.
Modify generate_figure()
to accept more effective args.
Use numpy args of coord_data
, coord_labels
and cluster_labels
individually, rather than using DataFrames.
Allow passing extra coord_data
beyond what's labelled.
Add automatic padding to polis implementation when cluster centroid guesses are provided.
Add PolisKMeans
scikit-learn estimator with:
cluster initialization strategy matching Polis,
new init_centers
argument with more versatility for being given more/less guesses than needed, and
new instance variable init_centers_used_
to allow inspection of guesses used.
Allow passing KMeans init
strategy into find_optimal_k()
.
Remove pad_centroid_list_to_length
helper function.
Add GridSearchNonCV
to find optimal K via silhouette scores.
For interal util functions, replace max_group_count
args with k_bounds
for upper and lower k bounds.
Add PolisKMeansDownsampler
transformer to support base clustering.
Update get_corrected_centroid_guesses()
to also extract from base clusters.
Remove extraneous return values from PolisClusteringResult
.
Add data_presenter.generate_figure_polis()
for making graphs from PolisClusteringResult.
Add group_aware_consensus
dataframe to PolisClusteringResult of polis implementation.
Add group statement stats to MultiIndex DataFrame.
Add reddwarf.data_presenter.print_repress()
for printing representative statements.
Add support for Loader()
importing data from alternative Polis instances via polis_instance_url
arg.
Patch sklearn with a simple PatchedPipeline
, to allow pipeline steps to access other steps.
Modify SparsityAwareScaler
to be able to use captured output from SparsityAware Capture.
Remove ported Polis PCA functions that are no longer used.
Remove old impute_missing_votes()
function that's no longer used.
In PolisClusteringResult
, created new statements_df
and participants_df
with all raw calculation values.
Chores
Moved agora implementation from reddwarf.agora
to reddwarf.implementations.agora
(deprecation warning).
Add missing conversation.json
fixture file.
Extract statement processing from polis class-based client to pure util function.
Add types to fully describe polismath object. (#28 )
Add new fixture for large convo without meta statements. (#28 )
Add ability to filter unit tests and avoid running whole suite. (#44 )
Improve test fixture to download remote Polis data.
Add helper to support simple sign-flips in Polis test data.
Remove usage of PolisClient in tests, in favour of [data] Loader.
Start storing keep_participant_ids
in fixtures.
Add solid unit test for expected variance, which is stablest measure we can derive.
Use dataclasses for polis_convo_data
test fixture.
Add utils.polismath.get_corrected_centroid_guesses()
to initiate centroid guesses from Polis API.
Remove unused init_cluster()
helper.
0.2.0 (2025-03-24)
Fixed
Relax seaborn version constraint to be compatible with TabPFN. (#16 )
Data loader was not downloading last participant's votes, so most PCA results slightly off. (#29 )
Changes
Implement utils.calculate_representativeness()
function. (#22 )
Add color legend for labels in data_presenter.generate_figure()
. d55f535
(#22 )
Implement calculations of all comment statistics. (#25 )
Implement utils.select_representative_statements()
to reproduce polismath output. (#25 )
Migrate from red-dwarf-democracy
PyPI project namespace to red-dwarf
.
Chores
Restructure utils.py
into separate files. (#26 )
Add unit tests for utils.run_pca()
to test against real polismath data.
Add unit tests for agora.run_clustering()
.
Parametrize unit tests for real polis convo data.
Add testing for notebook examples. (#34 )
0.1.1 (2025-03-04)
Bugfixes
Fix publishing issue with missing license file. (#19 )
Workaround for pypa/setuptools#4769
.
Change package name from reddwarf
to red-dwarf-democracy
.
0.1.0 (2025-03-04)
Add Agora/ZKorum to README as sponsor.
Add low-level stateless functions in reddwarf.utils
.
Add first-pass unit test coverage, and CI.
Add mid-level stateless reddwarf.agora
implementation.
Add preferred types/modules and function definitions.
Add code coverage reports for unit tests.
Document library usage in Jupyter notebooks.
Create documentation website using mkdocs
.
Add make
targets for common development tasks, and default help target.
Experimental
Add high-level stateful Polis client class implementation.
Add high-level stateful data loader class.
Support loading from JSON API, CSV export API, local files, and remote directory.
Support rate-limiting.
Support request caching.
Support bypass of Cloudflare for all API endpoints.
Add high-level stateful data presenter class.
Add integration tests.