pyrcv package

Submodules

pyrcv.cli module

Script to tabulate single transferable vote results from a CSV file.

The format required is parsed in the function pyrcv.transform.parse_google_form_csv().

pyrcv.pyrcv module

Implementation of Single Transferable Vote.

class pyrcv.pyrcv.RoundMode(value, names=None, *values, module=None, qualname=None, type=None, start=1, boundary=None)[source]

Bases: Enum

How to round a fractional vote threshold to win an election.

\[votes\_needed = num\_votes / (num\_winners + 1)\]

CEILING = 1

If threshold is a fraction, use the next highest integer.

Used in tests against FairVote data, though technically incorrect when the unrounded threshold is already an integer.

ADD_ONE_FLOOR = 2

Add one. If threshold is a fraction, use the next lowest integer.

Easy to understand, as it maintains an integer threshold. More correct than CEILING, as it avoids ties when the result is an integer before rounding. For example, if there are 100 votes in a single-winner race, the threshold to win should be 51 votes, not 50.

FRACTIONAL = 3

No rounding of fractional threshold, requires \(threshold + epsilon\) votes.

Technically, this is the most correct, but leads to lots of fractional votes.
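The three rounding modes can be compared side by side with a minimal sketch. It assumes the divisor is the number of winners plus one, which is what the 100-vote example implies. RoundModeSketch and votes_needed are hypothetical names used only for illustration; they are not part of pyrcv.

```python
import math
from enum import Enum


class RoundModeSketch(Enum):
    """Hypothetical stand-in mirroring the documented enum values."""
    CEILING = 1
    ADD_ONE_FLOOR = 2
    FRACTIONAL = 3


def votes_needed(num_votes, num_winners, mode):
    """Return the threshold under each rounding mode (illustrative only)."""
    raw = num_votes / (num_winners + 1)  # assumed divisor: winners + 1
    if mode is RoundModeSketch.CEILING:
        return math.ceil(raw)            # next integer up, unchanged if already an integer
    if mode is RoundModeSketch.ADD_ONE_FLOOR:
        return math.floor(raw) + 1       # add one, then round down
    if mode is RoundModeSketch.FRACTIONAL:
        return raw                       # a winner must strictly exceed this value
    raise ValueError(f"unknown round mode: {mode!r}")
```

With 100 votes and one winner, CEILING yields 50 (which permits a 50-50 tie), ADD_ONE_FLOOR yields 51, and FRACTIONAL yields 50.0, which a winner must exceed.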

pyrcv.pyrcv.run_rcv(race_data: RaceData, round_mode: RoundMode = RoundMode.ADD_ONE_FLOOR) → RaceResult[source]

Run the ranked choice voting algorithm for a single election.

RCV is a method of voting in which each voter casts a single vote in the form of a ranked list. Winners are determined via an iterative algorithm that requires a winner to surpass a threshold (e.g. half the votes in a single-winner election). If no one surpasses the threshold in an iteration, the candidate with the fewest votes is eliminated, and ballots supporting the eliminated candidate are transferred to the next-highest-ranked candidate on each ballot.

Parameters:

race_data – Full information about the race parameters and votes.
round_mode – The method for rounding the vote threshold.

Raises:

ValueError – Raised when round_mode has an unknown value, or when the ballot data has the wrong shape or contains bad values.
PyRcvError – Raised if an error is detected in the calculations.

Returns:

RCV results (winners, losers, vote transfers) for each round in the election.
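The elimination loop described above can be illustrated for the single-winner case. This is a simplified sketch, not the implementation behind run_rcv(): it handles only one winner, measures the majority against ballots still in play, and breaks ties by lowest candidate index. All of these are assumptions made for the example, and instant_runoff_sketch is a hypothetical name.

```python
def instant_runoff_sketch(ballots, votes, num_cands):
    """Return the 1-based index of the winner of a single-winner race."""
    active = set(range(1, num_cands + 1))
    while True:
        tally = {cand: 0 for cand in active}
        for ballot, weight in zip(ballots, votes):
            # A ballot counts for its highest-ranked candidate still in the race.
            top = next((c for c in ballot if c in active), None)
            if top is not None:
                tally[top] += weight
        total = sum(tally.values())  # exhausted ballots are excluded (assumption)
        leader = max(active, key=lambda c: (tally[c], -c))
        if tally[leader] * 2 > total or len(active) == 1:
            return leader
        # Eliminate the candidate with the fewest votes (lowest index on ties).
        loser = min(active, key=lambda c: (tally[c], c))
        active.remove(loser)
```

For example, with ballots [[1], [2, 1], [3, 2]] and votes [4, 3, 2], nobody has a majority in the first round, candidate 3 is eliminated, their two votes move to candidate 2, and candidate 2 wins 5 to 4.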

pyrcv.pyrcv.validate_and_standardize_ballots(ballots: list[list[int]], num_cands: int) → ndarray[source]

Pad ballots with zeros to form a 2-d array, and check for invalid votes.

Parameters:

ballots – List of ballots, which are lists of ints
num_cands – Number of candidates in the election.

Returns:

2-d array of ballot data
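A minimal sketch of the zero-padding step, written with plain lists rather than an ndarray so that it has no dependencies. The padded width (num_cands) and the name pad_ballots are assumptions for illustration; the validation performed by the real function is not reproduced here.

```python
def pad_ballots(ballots, num_cands):
    """Right-pad each ballot with zeros so every row has num_cands entries."""
    padded = []
    for row, ballot in enumerate(ballots):
        if len(ballot) > num_cands:
            raise ValueError(f"ballot {row} ranks more than {num_cands} candidates")
        # Zero means "no mark", so short ballots are filled out with zeros.
        padded.append(list(ballot) + [0] * (num_cands - len(ballot)))
    return padded
```

Calling pad_ballots([[1, 2], [2], []], 3) gives [[1, 2, 0], [2, 0, 0], [0, 0, 0]].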

pyrcv.pyrcv.check_oob(ballots: ArrayLike, num_cands: int)[source]

Checks that ballot indices are positive and <=num_cands.

Parameters:

ballots – List of ballots, which are lists of ints
num_cands – Number of candidates in the election.

Raises:

ValueError – If invalid indices are present, their positions are given in the error message.

pyrcv.pyrcv.check_dup_1d(ballot: ArrayLike) → bool[source]

Checks if there are duplicate candidate indices in a ballot.

Note that zero constitutes no mark, and duplicate zeros are allowed.

Parameters:

ballot – Ranking of candidate indices, e.g. [4,3,1,0,0]

Raises:

ValueError – If duplicate candidate indices present in ballot.

Returns:

True if duplicate candidate indices present in ballot, False otherwise.
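The rule that zeros may repeat while candidate indices may not can be sketched as follows. The function name is hypothetical, and this version only returns a boolean; it does not raise.

```python
def has_duplicate_rankings(ballot):
    """Return True if any nonzero candidate index appears more than once."""
    marked = [cand for cand in ballot if cand != 0]  # zeros mean "no mark"
    return len(marked) != len(set(marked))
```

Here [4, 3, 1, 0, 0] has no duplicates, while [4, 3, 4, 0, 0] does.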

pyrcv.transform module

Utilities to convert CSV output from Google Forms into the pyrcv standard format.

pyrcv.transform.parse_google_form_csv(buffer: str | PathLike[str] | ReadCsvBuffer[bytes] | ReadCsvBuffer[str]) → list[RaceData][source]

Parses race and ballot info from a Google Form CSV results file.

The required format is one header line, followed by one line for each ballot. The column headers are parsed to determine the races and candidates. For example, the following represents two races, one with 3 candidates and one with 2 candidates:

Mayor [Abe], Mayor [Betty], Mayor [Chris], Police Chief [Alice], Police Chief [Bob]

Each ballot should provide a numerical ranking within each race. A ballot cannot contain duplicate values within a single race. Using the example above, some valid ballots are:

1, 2, 1, 2, 3 # Can use raw numbers
1st, 2nd, 1st, 2nd, 3rd # Can use ordinals
1st, 2nd,,,, 1st # Does not rank Chris or Alice for Police Chief.
2nd,, 1st, 2nd, 3rd # Gap in ranking Mayoral race. Gaps are OK.

An example invalid ballot would be:

1, 1, 2, 1, 3 # Duplicate ranking in Mayor race.

Parameters:

buffer – CSV-parseable data in the format described above.

Returns:

List containing an entry for each race parsed from the CSV file.
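To make the ranking rules concrete, here is a sketch that turns the cells of one race (one cell per candidate column) into an ordered list of 1-based candidate indices. The source does not say how the library performs this conversion, so the mapping, the accepted ordinal suffixes, and the name rankings_to_ballot are all assumptions.

```python
import re

# Accepts raw numbers ("2") and ordinals ("2nd"); suffix handling is an assumption.
_RANK_RE = re.compile(r"(\d+)(?:st|nd|rd|th)?")


def rankings_to_ballot(cells):
    """Convert per-candidate rank cells into candidate indices in ranked order."""
    ranked = []
    for cand_index, cell in enumerate(cells, start=1):
        text = cell.strip()
        if not text:
            continue  # blank cell: this candidate was not ranked
        match = _RANK_RE.fullmatch(text)
        if match is None:
            raise ValueError(f"unrecognized ranking: {cell!r}")
        ranked.append((int(match.group(1)), cand_index))
    ranks = [rank for rank, _ in ranked]
    if len(ranks) != len(set(ranks)):
        raise ValueError("duplicate ranking within a single race")
    # Sorting by rank tolerates gaps, e.g. ranks 1 and 3 with no 2.
    return [cand for _, cand in sorted(ranked)]
```

Under these assumptions, the cells "2nd", "", "1st" yield [3, 1]: the third candidate is ranked first and the first candidate second.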

pyrcv.transform.parse_header(header: list[str]) → list[Tuple[RaceMetadata, slice]][source]

Parse the header row into metadata and a column slice for each race.

The Google Form header pattern is the race, followed by one of the options for that race in brackets. Adjacent columns for the same race have the same race text but different options. An optional parenthetical indicating the number of winners may appear between the race and the option; if this parenthetical is missing, the race is assumed to be single-winner.

Parameters:

header – Header row from CSV file

Returns:

List of tuples, one for each race. The tuple contains RaceMetadata and a slice indicating the columns corresponding to the options for the race.

Example header values:

What is your favorite season? [Spring]

City Council (4 winners) [Darth Vader]

Mayor (1 winner) [Luke Skywalker]
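The header pattern can be sketched with a regular expression. This is an illustrative approximation rather than the parser used by parse_header(): the exact pattern, the plain tuples standing in for RaceMetadata, and the name parse_header_sketch are assumptions.

```python
import re

# race name, optional "(N winner[s])", then the option in square brackets
HEADER_RE = re.compile(
    r"^(?P<race>.*?)\s*(?:\((?P<n>\d+)\s+winners?\))?\s*\[(?P<option>[^\]]*)\]\s*$"
)


def parse_header_sketch(header):
    """Group adjacent columns by race; return (name, num_winners, options, slice)."""
    races = []
    for col, text in enumerate(header):
        match = HEADER_RE.match(text.strip())
        if match is None:
            raise ValueError(f"unrecognized header: {text!r}")
        race = match.group("race")
        if races and races[-1]["name"] == race:
            # Same race as the previous column: extend its option list and slice.
            races[-1]["options"].append(match.group("option"))
            races[-1]["stop"] = col + 1
        else:
            races.append({
                "name": race,
                "winners": int(match.group("n") or 1),  # default: single-winner
                "options": [match.group("option")],
                "start": col,
                "stop": col + 1,
            })
    return [(r["name"], r["winners"], r["options"], slice(r["start"], r["stop"]))
            for r in races]
```

For the header "City Council (4 winners) [Darth Vader]", the sketch extracts the race "City Council", four winners, and the option "Darth Vader".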

pyrcv.types module

Common types used in pyrcv.

exception pyrcv.types.PyRcvError[source]

Bases: Exception

Error in pyrcv.

class pyrcv.types.RaceMetadata(race_name: str, num_winners: int, names: list[str])[source]

Bases: object

Specification of a single race.

Parameters:

race_name – Unique name for this race.
num_winners – How many candidates will win the race.
names – List of candidate names.

race_name: str

num_winners: int

names: list[str]

class pyrcv.types.RaceData(metadata: RaceMetadata, ballots: list[list[int]], votes: list[int])[source]

Bases: object

Voting data for a single race.

The two main ways to use this class:

Each ballot corresponds to one person’s vote. Multiple ballots can be identical, and all the entries in votes are 1. This usage is the most verbose and uses more memory, but does correspond to the typical understanding of a ballot.
Each ballot corresponds to a unique ordering of the candidates. All ballots are unique. The entry in votes corresponding to a given ballot indicates the number of people who voted in that ordering. This usage is the most compact and uses less memory.

As an example, the following would be equivalent in a 2 candidate race with 7 voters:

metadata: <elided>
ballots: [[1,2], [2], [2,1], [2,1], [2], [1], [2, 1]]
votes: [1, 1, 1, 1, 1, 1, 1]

metadata: <elided>
ballots: [[1], [2], [1, 2], [2, 1]]
votes: [1, 2, 1, 3]
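Converting the verbose form into the compact form amounts to counting identical ballots. A minimal sketch follows, using a hypothetical helper compact_ballots that is not part of pyrcv:

```python
from collections import Counter


def compact_ballots(ballots, votes):
    """Merge identical ballots, summing their vote counts."""
    counts = Counter()
    for ballot, weight in zip(ballots, votes):
        counts[tuple(ballot)] += weight  # lists are unhashable, so key on tuples
    unique = [list(ballot) for ballot in counts]
    return unique, [counts[tuple(ballot)] for ballot in unique]
```

Applied to the seven single-vote ballots above, this produces the same four unique orderings with counts 1, 2, 1 and 3, though possibly listed in a different order.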

Parameters:

metadata – Details about the race.
ballots – A list of candidate rankings. Each list[int] is an ordering of a subset of candidate indexes, with each index referring to the list of candidates in metadata.
votes – A list of the same length as ballots which denotes the number of votes corresponding to each candidate ranking. The sum of votes corresponds to the total number of votes cast.

metadata: RaceMetadata

ballots: list[list[int]]

votes: list[int]

class pyrcv.types.RoundResult(count: list[float], elected: list[int], eliminated: list[int], transfers: dict[int, dict[int, float]])[source]

Bases: object

The full results of a single round of single transferable vote tabulation.

Parameters:

count – The votes for each candidate. Candidates are indexed starting at 1, with index 0 reserved for exhausted ballots.
elected – Candidate indices that won the election by this round.
eliminated – Candidate indices that lost the election by this round.
transfers – Vote counts transferred during this round. It is a two-level map of src_cand_index -> tgt_cand_index -> vote_count

count: list[float]

elected: list[int]

eliminated: list[int]

transfers: dict[int, dict[int, float]]

class pyrcv.types.RaceResult(metadata: RaceMetadata, rounds: list[RoundResult])[source]

Bases: object

The list of all round results for a single race.

Parameters:

metadata – Details about the race.
rounds – List of results from each round of tabulation.

metadata: RaceMetadata

rounds: list[RoundResult]

pyrcv.viz module

Utilities for creating Sankey plots of RCV results.

pyrcv.viz.NODE_PALETTE = ['rgb(27,158,119)', 'rgb(217,95,2)', 'rgb(117,112,179)', 'rgb(231,41,138)', 'rgb(102,166,30)', 'rgb(230,171,2)', 'rgb(166,118,29)', 'rgb(102,102,102)']: Default Sankey diagram node colors.

pyrcv.viz.LINK_PALETTE = ['rgb(102,194,165)', 'rgb(252,141,98)', 'rgb(141,160,203)', 'rgb(231,138,195)', 'rgb(166,216,84)', 'rgb(255,217,47)', 'rgb(229,196,148)', 'rgb(179,179,179)']: Default Sankey diagram link colors.

pyrcv.viz.result_to_sankey_data(race_result: RaceResult, hide_exhausted=False, node_palette=NODE_PALETTE, link_palette=LINK_PALETTE) → Tuple[DataFrame, DataFrame][source]

Convert a pyrcv.types.RaceResult into data needed for a Sankey diagram.

A Sankey diagram consists of nodes and links between nodes. Links have a value denoting the flow from the source node to the target node. The value of a node is the sum of flows into the node minus the flows out of the node.

In a Sankey diagram visualizing an RCV election, there is a node for each candidate in each round. Nodes are associated with a round, have a color, and are labelled by their candidate. Node indexes for a given round/candidate are shown in the following table:

        cand_0   cand_1  ...   cand_n-1
        ------  -------       ---------
round_0:     0,       1, ...,   ncand-1
round_1: ncand, ncand+1, ..., 2*ncand-1
... etc ...
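The table corresponds to a simple row-major formula, sketched here with a hypothetical helper (round and candidate positions are zero-based, as in the table):

```python
def node_index(round_idx, cand_idx, ncand):
    """Node index for a candidate in a round, laid out row by row."""
    return round_idx * ncand + cand_idx
```

With three candidates, round 0 uses nodes 0 to 2 and round 1 uses nodes 3 to 5.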

Links are between nodes in consecutive rounds, and denote how many votes flow from one candidate to another - including votes a candidate keeps for themself. A link has a source and a target. By default, it is colored with the same hue as the node it starts from, but with lighter saturation.

Parameters:

race_result – Result of a single RCV race.
hide_exhausted – If True, do not include counts of exhausted ballots. Defaults to False.
node_palette – Colors to use for nodes in Sankey diagram. Should match in hue to link_palette. If there are more candidates than colors, the palette is cycled. Defaults to NODE_PALETTE.
link_palette – Colors to use for links between nodes in Sankey diagram. Should match in hue to node_palette. If there are more candidates than colors, the palette is cycled. Defaults to LINK_PALETTE.
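Palette cycling can be pictured as indexing modulo the palette length. The helper below is a hypothetical illustration, and the short palette reuses the first three documented NODE_PALETTE colors; how the library maps candidates to palette positions is an assumption.

```python
PALETTE_SKETCH = ["rgb(27,158,119)", "rgb(217,95,2)", "rgb(117,112,179)"]


def color_for(cand_idx, palette):
    """Pick a color for a candidate, wrapping around when the palette runs out."""
    return palette[cand_idx % len(palette)]
```

With a three-color palette, the fourth candidate (index 3) reuses the first color.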

Returns:

Returns two dataframes:

Node dataframe with columns round, label, color
Link dataframe with columns source, target, value, color

pyrcv.viz.create_sankey_fig(df_nodes: DataFrame, df_links: DataFrame) → Figure[source]

Creates Sankey diagram using data about colors, labels, and vote transfers.

See result_to_sankey_data() for a description of the data.

Parameters:

df_nodes – Dataframe containing columns of node info: round, label, color
df_links – Dataframe containing columns of link info: source, target, value, color

Returns:

Sankey diagram

Module contents

Top-level package for pyrcv.