Preprocess

Parse and preprocess MTGA data, including decklists, 17lands data, and Scryfall data.

class mtga_ml.preprocess.DecksDataset(df, keys=None)

A class.

class mtga_ml.preprocess.PicksDataset(df, keys=None)

Loads 17lands draft data as a PyTorch Dataset. Each row represents a single draft pick.

Parameters
  • df (DataFrame) – Raw 17lands draft dataset.

  • keys (list[str]) – Keys to collate in __getitem__. May include names of columns in df as well as “pool” and “pack”. If None, uses all valid keys.

card_names

List of all card names that appear in the column names of df.

Type

list[str]

num_cards

Length of card_names.

Type

int
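
The relationship between keys, card_names, and num_cards can be illustrated with a small, dependency-free sketch. It is not the mtga_ml implementation: the class name PicksSketch, the "pack_card_" and "pool_" column prefixes, and the use of a list of dicts in place of a DataFrame are all assumptions made for illustration.

```python
# Sketch only: the column prefixes and this class are assumptions for
# illustration, not the actual mtga_ml.preprocess.PicksDataset code.
PACK_PREFIX = "pack_card_"
POOL_PREFIX = "pool_"


class PicksSketch:
    """Map-style dataset over draft picks, one item per row."""

    def __init__(self, rows, keys=None):
        # rows: list of dicts mapping column name -> value (stand-in for df).
        self.rows = rows
        columns = list(rows[0]) if rows else []
        # Card names are recovered from the pack columns, in column order.
        self.card_names = [
            c[len(PACK_PREFIX):] for c in columns if c.startswith(PACK_PREFIX)
        ]
        self.num_cards = len(self.card_names)
        # Valid keys: every non-card column, plus the derived "pool" and "pack".
        plain = [
            c for c in columns if not c.startswith((PACK_PREFIX, POOL_PREFIX))
        ]
        valid = plain + ["pool", "pack"]
        self.keys = valid if keys is None else list(keys)
        unknown = [k for k in self.keys if k not in valid]
        if unknown:
            raise KeyError(f"unknown keys: {unknown}")

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, index):
        row = self.rows[index]
        item = {}
        for key in self.keys:
            if key == "pack":
                # One count per card, ordered like card_names.
                item[key] = [row[PACK_PREFIX + n] for n in self.card_names]
            elif key == "pool":
                item[key] = [row[POOL_PREFIX + n] for n in self.card_names]
            else:
                item[key] = row[key]
        return item


# Toy data with placeholder card names.
rows = [
    {
        "pick": "Card A",
        "pack_card_Card A": 1,
        "pack_card_Card B": 1,
        "pool_Card A": 0,
        "pool_Card B": 2,
    }
]
ds = PicksSketch(rows, ["pool", "pack", "pick"])
```

In this sketch, "pool" and "pack" become fixed-length vectors of length num_cards, so every item has the same shape regardless of which cards appear in a given pick.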

Examples

>>> from mtga_ml.preprocess import PicksDataset, load_17lands_data
>>> df = load_17lands_data(
...     "/data",
...     "DMU",
...     "PremierDraft",
...     "draft",
... )
>>> keys = ["pool", "pack", "pick"]
>>> draft_dataset = PicksDataset(df, keys)

mtga_ml.preprocess.load_17lands_data(output_dir, mtga_set, mtga_format, dataset_type, nrows=None, chunk_size=8192, force_download=False)

Loads a public dataset from 17lands.

Parameters
  • output_dir (str) – Directory to download the 17lands dataset to.

  • mtga_set (str) – MTGA set identifier, e.g., “DMU”.

  • mtga_format (str) – MTGA format identifier, e.g., “PremierDraft”.

  • dataset_type (str) – 17lands dataset type identifier, e.g., “draft”.

  • nrows (int) – Number of rows to load. If None, loads all rows.

  • chunk_size (int) – Chunk size to use for streaming download of dataset.

  • force_download (bool) – If True, downloads the 17lands dataset to output_dir even if it already exists there.

Returns

The 17lands dataset as a Pandas DataFrame.

Examples

>>> from mtga_ml.preprocess import load_17lands_data
>>> df = load_17lands_data(
...     "/data",
...     "DMU",
...     "PremierDraft",
...     "draft",
... )
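
How chunk_size and force_download might interact can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the code behind load_17lands_data: the helper name fetch_cached and its opener parameter are hypothetical, and the caller supplies the URL.

```python
import os
import urllib.request


def fetch_cached(url, path, chunk_size=8192, force_download=False,
                 opener=urllib.request.urlopen):
    """Stream url to path in chunk_size-byte pieces, reusing an existing file.

    Hypothetical helper: `opener` exists only so the network call can be
    swapped out (for example in tests).
    """
    if os.path.exists(path) and not force_download:
        return path  # cached copy found; skip the download
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with opener(url) as response, open(path, "wb") as out:
        while True:
            # Read at most chunk_size bytes so the whole file is never
            # held in memory at once.
            chunk = response.read(chunk_size)
            if not chunk:
                break
            out.write(chunk)
    return path
```

Once the file is on disk, a row limit such as nrows could be applied at read time, for instance through the nrows argument of pandas.read_csv.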