nuff

https://github.com/aiez/nuff

nuff is one tiny file of reusable Python tricks: attribute-dicts, typed CSV, pretty-printing, seeded randomness, non-parametric stats, minimal column summaries, row distances, naive Bayes, and a min-variance tree. Pure stdlib, zero dependencies. It is the cut-down kernel under my bigger apps and has no global config: every parameter (p, cliff, conf, rng) is passed as a keyword, so any function lifts out into another project.

from nuff import o, csv, say, Data, disty, same, shuffle
import random

d = Data(csv("../optimiz/auto93.csv"))         # build a table
say(disty(d, d.rows[0], p=1))                  # row->goal distance, p a kwarg
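shuffle(d.rows, rng=random.Random(1))          # repeatable, own RNG
a = [disty(d, r) for r in d.rows[:50]]         # two samples of numbers
b = [disty(d, r) for r in d.rows[50:100]]
same(a, b, cliff=0.195, conf=1.36)             # are two samples the same?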

Files: nuff.py | test_nuff.py | Makefile | pyproject.toml

NAME

nuff - one file of tiny stdlib python tricks (no global config)

DESIGN

One file, themed sections, no module-level `the`. Tuning rides
along as keyword args, so any function drops into another app:
  disty(data, row, p=2)
  same(xs, ys, cliff=0.195, conf=1.36)
  shuffle(lst, rng=random.Random(seed))

API

records / io / format
  o              SimpleNamespace alias: o(x=1).x == 1
  thing(s)       coerce str -> int|float|bool|str
  settings(s)    every var=val in s -> an o (vals coerced)
  csv(file)      yield typed rows ('#' = comment)
  say(x, dec=2)  pretty str; whole floats as ints
  sho(rows,just) align list[list[str]] to cols; just='>'/'<'
  main(funs)     run funs[name] for each --name in argv
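
The coercion that `thing` describes can be sketched as below. This is a minimal stand-in, not nuff's source: `coerce` is a hypothetical name, and the assumption that booleans are spelled "true"/"false" (any case) is mine.

```python
from types import SimpleNamespace as o

def coerce(s):
    """Hypothetical stand-in for thing(): str -> int | float | bool | str."""
    s = s.strip()
    for cast in (int, float):            # try the narrowest numeric type first
        try:
            return cast(s)
        except ValueError:
            pass
    if s.lower() in ("true", "false"):   # assumption: booleans are spelled out
        return s.lower() == "true"
    return s                             # anything else stays a string

# One CSV-like line, coerced cell by cell.
row = [coerce(cell) for cell in "4, 2.5, True, ford".split(",")]

# An attribute-dict in the style of o(x=1).x == 1.
cfg = o(p=coerce("2"), seed=coerce("1"))
```

Trying `int` before `float` keeps whole numbers as ints, so `"4"` does not silently become `4.0`.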

rand   (pass your own random.Random(seed) for repeatability)
  shuffle(lst, rng)     shuffled copy
  some(lst, k, rng)     sample without replacement
  one(lst, rng)         one random item
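
A minimal sketch of the "pass your own RNG" idea, built only on `random.Random`. The names `shuffled`, `sample_k`, and `pick_one` are hypothetical stand-ins, not nuff's functions, and their bodies are assumptions about what such helpers might do.

```python
import random

def shuffled(lst, rng):
    """Return a shuffled copy; the caller's list is left alone."""
    out = list(lst)
    rng.shuffle(out)
    return out

def sample_k(lst, k, rng):
    """k items drawn without replacement."""
    return rng.sample(list(lst), k)

def pick_one(lst, rng):
    """One random item."""
    return rng.choice(list(lst))

items = list(range(10))
first = shuffled(items, rng=random.Random(1))
again = shuffled(items, rng=random.Random(1))   # same seed, same order
```

Because each call owns its generator, two callers never disturb each other's random sequence, which a shared module-level seed cannot guarantee.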

stats  (non-parametric "are these two the same?")
  cliffs(xs, ys)        Cliff's delta effect size 0..1
  ks(xs, ys)            Kolmogorov-Smirnov CDF gap
  same(xs, ys, cliff=.195, conf=1.36)
  top_tier(groups, ...) names tied for best (min median)
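
The two tests can be sketched from their textbook definitions. This is not nuff's source: the names are hypothetical, and the way `looks_same` combines the tests (both must pass, with `conf` scaling the usual two-sample KS critical value) is an assumption about what `same` might do.

```python
from bisect import bisect_right
from math import sqrt

def cliffs_delta(xs, ys):
    """|#(x>y) - #(x<y)| / (n*m): 0 = full overlap, 1 = none."""
    gt = sum(x > y for x in xs for y in ys)
    lt = sum(x < y for x in xs for y in ys)
    return abs(gt - lt) / (len(xs) * len(ys))

def ks_gap(xs, ys):
    """Largest vertical gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    cdf = lambda vals, v: bisect_right(vals, v) / len(vals)
    return max(abs(cdf(xs, v) - cdf(ys, v)) for v in xs + ys)

def looks_same(xs, ys, cliff=0.195, conf=1.36):
    """Assumption: 'same' means a small effect AND a small CDF gap."""
    n, m = len(xs), len(ys)
    return (cliffs_delta(xs, ys) <= cliff and
            ks_gap(xs, ys) <= conf * sqrt((n + m) / (n * m)))
```

Here 1.36 is the conventional KS coefficient for a 95% confidence level; the threshold 0.195 is taken from the defaults listed above.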

columns  (lightweight: Sym = dict {value:count}, Num = (n,mu,m2);
          dispatch by isa(col, Sym), no tags)
  Sym() Num(n,mu,m2)       a dict / a 3-tuple; n_/mu_/m2_ read it
  welford(num,v,inc=1)     Num + v (inc=-1 removes) -> new Num
  mix(i,j,inc=1)           merge two same-type cols (inc=-1 = out)
  mid(col) spread(col)     mode/mean, entropy/stdev
  norm(num,v)              0..1 via a logistic on v's z-score (Num)
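
An incremental update over an `(n, mu, m2)` tuple can be sketched as follows. It follows the standard Welford recurrence and its reversal; the handling of an emptied column and the clamp on `m2` are my assumptions, and nuff's own `welford` may differ.

```python
from math import sqrt

def welford(num, v, inc=1):
    """Return a new (n, mu, m2) with v added (inc=1) or removed (inc=-1)."""
    n, mu, m2 = num
    n2 = n + inc
    if n2 <= 0:                          # assumption: emptied column resets
        return (0, 0.0, 0.0)
    d = v - mu
    mu2 = mu + inc * d / n2
    m22 = m2 + inc * d * (v - mu2)
    return (n2, mu2, max(m22, 0.0))      # clamp tiny negative float error

def stdev(num):
    """Sample standard deviation from the running sums."""
    n, _, m2 = num
    return sqrt(m2 / (n - 1)) if n > 1 else 0.0

num = (0, 0.0, 0.0)
for v in [2, 4, 4, 4, 5, 5, 7, 9]:
    num = welford(num, v)                # each step returns a fresh tuple
```

Since every call returns a new tuple, earlier summaries stay valid, which matches the "never mutates" note for the table section.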

table  (immutable cols -> add RETURNS the new it, never mutates)
  add(it,v,inc=1)          fold v into a Sym/Num, or a row into a
                           Data; returns it (inc=-1 subtracts)
  adds(src,it=None)        fold all of src into it (default a Num)
  Data(rows)               -> o(names, cols{at:col}, x, y, goal,
                           klass, rows); first row = the header.
                           Header names: Upper = Num, lower = Sym;
                           +/-/! marks a goal (y) column:
                           + maximize, - minimize, ! = klass;
                           X = skip the column
  clone(data, src=None)    new Data, same columns, fresh rows
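
The header conventions can be sketched as a small parser. This is an illustration, not nuff's `Data`: `roles` is a hypothetical helper, the header is made up in the style described, and reading `+`, `-`, `!`, and `X` as trailing marks is an assumption.

```python
def roles(names):
    """Map column position -> the role its header name implies."""
    out = {}
    for at, name in enumerate(names):
        out[at] = dict(
            name=name,
            kind="Num" if name[0].isupper() else "Sym",  # Upper=Num, lower=Sym
            skip=name.endswith("X"),                     # assumption: suffix
            goal=name[-1] in "+-!",                      # y column
            maximize=name.endswith("+"),
            klass=name.endswith("!"))
    return out

# Illustrative header only; not read from any file.
header = ["Clndrs", "origin", "HpX", "Lbs-", "Acc+"]
info = roles(header)
x_cols = [at for at, r in info.items() if not r["goal"] and not r["skip"]]
y_cols = [at for at, r in info.items() if r["goal"]]
```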

distance  (exponent `p` is a keyword)
  minkowski(vals, p=2)
  disty(data, row, p=2)      distance to best goals (0=ideal)
  distx(data, r1, r2, p=2)   distance between two rows on x
  gap(col, u, v)             per-column value distance 0..1
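
A sketch of the distance idea with `p` as a keyword. Dividing by the number of values (so results stay in 0..1 when inputs do) is an assumption, as is the hypothetical `goal_distance`, which expects goal values already scaled to 0..1.

```python
def minkowski(vals, p=2):
    """Normalized Minkowski length: (sum(|v|^p) / n) ^ (1/p)."""
    vals = list(vals)
    return (sum(abs(v) ** p for v in vals) / len(vals)) ** (1 / p)

def goal_distance(normed, maximize, p=2):
    """Distance from 0..1 goal values to the ideal (1 if maximized, else 0)."""
    return minkowski(
        (v - (1 if up else 0) for v, up in zip(normed, maximize)), p=p)

ideal = goal_distance([1.0, 0.0], [True, False])   # best on both goals
worst = goal_distance([0.0, 1.0], [True, False])   # worst on both goals
```

Raising `p` makes the largest single gap dominate; `p=1` averages the gaps instead.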

bayes  (naive bayes; m, k carried as kwargs, no global the)
  like(col, v, n=0, prior=0, k=1)       how a column likes a value
                                        (n = #rows, for the Sym case)
  likes(data, row, nrows, nklasses)     log-likelihood of a row
  confuse(pairs)                        (want,got) -> per-class
                                        o(pd, pf, prec, acc)
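
The per-class report from `(want, got)` pairs can be sketched with the usual definitions: pd is recall, pf is the false-alarm rate, prec is precision, acc is accuracy. `confusion` is a hypothetical name, and returning 0.0 for an empty denominator is my assumption.

```python
from types import SimpleNamespace as o

def confusion(pairs):
    """(want, got) pairs -> {class: o(pd, pf, prec, acc)}."""
    pairs = list(pairs)
    ratio = lambda a, b: a / b if b else 0.0      # assumption: 0 when undefined
    out = {}
    for c in sorted({k for pair in pairs for k in pair}):
        tp = sum(w == c and g == c for w, g in pairs)
        fn = sum(w == c and g != c for w, g in pairs)
        fp = sum(w != c and g == c for w, g in pairs)
        tn = len(pairs) - tp - fn - fp
        out[c] = o(pd=ratio(tp, tp + fn),         # recall
                   pf=ratio(fp, fp + tn),         # false alarms
                   prec=ratio(tp, tp + fp),       # precision
                   acc=ratio(tp + tn, len(pairs)))
    return out

report = confusion([("a", "a"), ("a", "b"), ("b", "b"), ("b", "b")])
```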

tree  (min-variance binary tree; exact cuts, no binning)
  tree(data, leaf=3, maxDepth=12)       build; y defaults to disty
  treeCuts(data, rows, y)               yield candidate cuts
  treeCut(data, rows, y, leaf=3)        the best cut (or None)
  treePredict(t, row)                   leaf disty mean for a row
  treeShow(data, t)                     table: +/- d2h n ymeans tree
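
Choosing a min-variance cut on one numeric column can be sketched as below. This is a deliberately simple quadratic-time illustration, not nuff's `treeCut`: `best_cut` and `sse` are hypothetical names, and scoring a cut by the summed squared error of both sides is an assumption.

```python
def sse(vals):
    """Sum of squared deviations from the mean."""
    mu = sum(vals) / len(vals)
    return sum((v - mu) ** 2 for v in vals)

def best_cut(xs, ys, leaf=3):
    """Return x such that 'x <= cut' best splits ys, or None."""
    pairs = sorted(zip(xs, ys))
    if not pairs:
        return None
    leaf = max(leaf, 1)
    best, best_cost = None, sse([y for _, y in pairs])  # must beat no split
    for i in range(leaf, len(pairs) - leaf + 1):        # keep >= leaf per side
        if pairs[i - 1][0] == pairs[i][0]:
            continue                                    # cannot split equal x
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best, best_cost = pairs[i - 1][0], cost     # cut at a seen value
    return best

cut = best_cut([6, 1, 5, 2, 4, 3], [9, 1, 9, 1, 9, 1], leaf=1)
```

Cuts land on values actually present in the data, which is one reading of "exact cuts, no binning".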

STYLE

Minimal python: one file, one-line comments, ~65-char lines,
very short functions, `i` (not self), records over classes.
Threshold for a new file vs a new gist: parts you *import* stay
in here; *wholes* you *run* (apps, other languages, data) get
their own gist.

LICENSE

MIT. https://choosealicense.com/licenses/mit/

AUTHOR

Tim Menzies <timm@ieee.org>
