fft

https://github.com/aiez/fft

A fast-and-frugal tree (FFT) makes urgent binary choices by asking a few yes/no questions and deliberately ignoring most of the data. It is fast because it needs little computation, and frugal because it often stops after one or two cues. This repo has two files. fft.py is the core (about 240 lines of pure Python, no dependencies): one pass grows 16 candidate trees, one per bias string, using minimum-variance cuts computed from incremental Welford statistics. eval.py samples many such trees and picks winners cheaply (successive halving, Hoeffding racing), scoring accuracy plus fairness. Data is CSV; each column name tags its type (case of the first character) and its goal (suffix).
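
The incremental Welford statistics mentioned above keep a running mean and variance without storing the rows, which is what lets a cut be scored in one pass. The following is a minimal sketch of that update, not the code in fft.py; the class name `Welford` and its fields are hypothetical.

```python
class Welford:
    """Running mean and sample variance in one pass (Welford's update)."""

    def __init__(self):
        self.n = 0      # values seen so far
        self.mu = 0.0   # running mean
        self.m2 = 0.0   # running sum of squared deviations from the mean

    def add(self, x):
        self.n += 1
        delta = x - self.mu
        self.mu += delta / self.n
        self.m2 += delta * (x - self.mu)  # uses the updated mean
        return self

    def var(self):
        # Sample variance; zero until there are at least two values.
        return self.m2 / (self.n - 1) if self.n > 1 else 0.0


w = Welford()
for x in [2, 4, 4, 4, 5, 5, 7, 9]:
    w.add(x)
print(w.n, w.mu, w.var())
```

A min-variance cut can then be found by keeping one such accumulator on each side of a candidate split and preferring the split whose row-weighted variance is smallest.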

# install and test
git clone https://github.com/aiez/optimiz && git clone https://github.com/aiez/fft fft
cd fft && python3 -B fft.py -f ../optimiz/auto93.csv

Files: fft.py | eval.py

NAME

fft - fast-frugal multi-objective trees; eval - sample + race them

SYNOPSIS

python3 -B fft.py  [-flag VAL]... [--grows|--trees]
python3 -B eval.py [-flag VAL]... [--sha|--csv|--final]

OPTIONS

fft.py (defaults come from its docstring, the single source of truth):

  -s seed     random seed              (1234567891)
  -p dist     distance exponent        (2)
  -b bins     numeric bin count        (7)
  -d depth    max tree depth           (4)
  -R Round    display decimals         (2)
  -F fan      cue fan: ifan | dfan     (ifan)
  -f file     data file                ($DOOT/optimiz/auto93.csv)
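
With the default depth of 4, the count of 16 candidate trees matches 2^4. One plausible reading, stated here as an assumption rather than as what fft.py does, is that a bias string holds one bit per tree level, and each bit picks which outcome that level's exit favors. The helper `bias_strings` below is hypothetical and only enumerates such strings.

```python
from itertools import product


def bias_strings(depth=4):
    # Assumption: one bit per level; 2**depth strings in total.
    return ["".join(bits) for bits in product("01", repeat=depth)]


biases = bias_strings(4)
print(len(biases), biases[:3], biases[-1])
```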

eval.py:

  -f file     data file                ($DOOT/fairnez/adult.csv)
  -t trainN   train rows per sample    (200; 100 in --sha/--final)
  -P pos      positive label           (auto: labels.py, else minority)
  -r repeats  resamples of trainN rows (10)
  -S start    first rung, rows         (50; --sha/--final only)

DATA

CSV with header row. Each column name encodes type + role
via first char (case) and last char (suffix):

  first char UPPER  -> numeric (Num)
  first char lower  -> symbolic (Sym)
  suffix '+'        -> numeric goal, maximize
  suffix '-'        -> numeric goal, minimize
  suffix '!'        -> symbolic goal (klass)
  suffix '~'        -> protected attribute (fairness)
  suffix 'X'        -> ignore
  else              -> predictor

Missing values: '?'. Example header:

  Clndrs,Volume,HpX,Model,origin,Lbs-,Acc+,Mpg+
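
The naming rules above can be applied mechanically. The sketch below classifies each column of the example header; the function name `column_role` and the role labels it returns are illustrative, not identifiers taken from fft.py.

```python
def column_role(name):
    """Return (type, role) for one CSV column name, per the DATA rules."""
    kind = "Num" if name[0].isupper() else "Sym"
    suffix_roles = {
        "+": "maximize",
        "-": "minimize",
        "!": "klass",
        "~": "protected",
        "X": "ignore",
    }
    # Any other last character means the column is a plain predictor.
    return kind, suffix_roles.get(name[-1], "predictor")


header = "Clndrs,Volume,HpX,Model,origin,Lbs-,Acc+,Mpg+".split(",")
roles = {name: column_role(name) for name in header}
for name, (kind, role) in roles.items():
    print(f"{name:8} {kind:4} {role}")
```

For this header, HpX is skipped, origin is the only symbolic predictor, and Lbs-, Acc+ and Mpg+ are the three numeric goals.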

TESTS

fft.py (multi-goal regression to distance-to-heaven):

  (none)      grow 16 trees, show the lowest-error one
  --trees     show all 16 candidate trees + their bias + error
  --grows     timing: trees/second over repeated samples
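
Distance-to-heaven folds several goals into one number to minimize. The sketch below assumes a common definition: normalize each goal to 0..1, take heaven as 1 for '+' goals and 0 for '-' goals, and combine the gaps with a Minkowski distance using the -p exponent. The function `d2h`, its argument layout, and the lo/hi ranges are hypothetical; fft.py may differ in detail.

```python
def d2h(row, goals, p=2):
    """row: dict of column -> value.
    goals: list of (column, lo, hi, heaven), with heaven 0 or 1."""
    total = 0.0
    for col, lo, hi, heaven in goals:
        norm = (row[col] - lo) / (hi - lo + 1e-32)  # scale goal to 0..1
        total += abs(norm - heaven) ** p
    return (total / len(goals)) ** (1 / p)  # 0 = best, 1 = worst


# Illustrative ranges only; real ranges would come from the data.
goals = [("Lbs-", 1000, 5000, 0), ("Mpg+", 10, 50, 1)]
best = d2h({"Lbs-": 1000, "Mpg+": 50}, goals)
worst = d2h({"Lbs-": 5000, "Mpg+": 10}, goals)
middle = d2h({"Lbs-": 3000, "Mpg+": 30}, goals)
print(best, middle, worst)
```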

eval.py (binary classification + fairness, e.g. fairnez CSVs):

  (none)      table: every tree x (pd, pf, prec, spd...) by disty
  --sha       full grid vs successive-halving vs Hoeffding race
  --csv       same comparison, one CSV row (for make ens)
  --final     race on train, report winner on test (make best)
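
The two cheap selection schemes compared by --sha can be sketched as follows. This is an illustration under assumptions, not eval.py's code: the halving fraction, the doubling of rows per rung, and the names `successive_halving`, `hoeffding_eps` and `toy_score` are all hypothetical. Lower scores are taken to be better.

```python
import math


def successive_halving(candidates, score, start=50):
    """Score all survivors on `rows` rows, keep the better half, double rows."""
    alive, rows, rungs = list(candidates), start, []
    while len(alive) > 1:
        rungs.append((rows, len(alive)))
        ranked = sorted(alive, key=lambda c: score(c, rows))
        alive = ranked[: max(1, len(alive) // 2)]
        rows *= 2
    return alive[0], rungs


def hoeffding_eps(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, a mean of n samples
    lies within eps of the true mean. A race can drop a candidate once
    its interval no longer overlaps the best candidate's interval."""
    return math.sqrt(value_range ** 2 * math.log(1 / delta) / (2 * n))


def toy_score(candidate, rows):
    # Stand-in for "error of this tree on `rows` rows"; candidate 3 is best.
    return abs(candidate - 3) + 1 / rows


winner, rungs = successive_halving(range(8), toy_score, start=50)
print(winner, rungs, hoeffding_eps(1.0, 0.05, 200))
```

Both schemes spend few rows on weak candidates: halving does so on a fixed schedule, while racing waits until the confidence intervals separate.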

TREE OUTPUT

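fft.py prints one decision spine, one cue per line. Rows that
match a cue stop there and are summarized by that line's d2h
(distance-to-heaven, 0 = best) and row count n; all other rows
fall through to the next cue. The last line is the leaf for
rows that match no cue.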

  if Volume <= 83        then d2h 0.28 n=18
  if Model <= 74         then d2h 0.65 n=140
  ...
                         leaf d2h 0.66 n=99
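
Reading such a spine amounts to a short loop: test each cue in order and exit at the first match. The sketch below uses only the two cues and the leaf printed above (the cues elided by "..." are left out, so this is a truncated tree); the tuple layout and the name `predict` are illustrative, not fft.py's representation.

```python
import operator


def predict(spine, leaf, row):
    """Return the d2h of the first cue the row matches, else the leaf's."""
    for col, op, cut, d2h in spine:
        if op(row[col], cut):
            return d2h  # this row exits here; later cues are never checked
    return leaf


spine = [
    ("Volume", operator.le, 83, 0.28),
    ("Model", operator.le, 74, 0.65),
]
leaf = 0.66

small_car = predict(spine, leaf, {"Volume": 80, "Model": 80})
old_car = predict(spine, leaf, {"Volume": 100, "Model": 70})
other = predict(spine, leaf, {"Volume": 100, "Model": 80})
print(small_car, old_car, other)
```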

Example:

  python3 -B fft.py -f ../optimiz/config_SS-N.csv --trees

EXIT

0  success
1  bad file / flag

LICENSE

MIT.  https://choosealicense.com/licenses/mit/
(c) 2025 Tim Menzies.

AUTHOR

Tim Menzies <timm@ieee.org>
