http://tiny.cc/nuff
nuff is one tiny file of reusable Python tricks: attribute-dicts,
typed CSV, pretty-print, seeded randomness, non-parametric stats,
minimal column summaries, and row distances. Pure stdlib, zero
dependencies. It is the cut-down kernel under my bigger apps, with
no global config: every parameter (p, cliff, conf, rng) is passed
as a keyword, so any function lifts out into another project.
from nuff import o, csv, say, Data, disty, same, shuffle
import random
d = Data(csv("../optimiz/auto93.csv"))  # build a table
say(disty(d, d.rows[0], p=1))           # row->goal distance, p a kwarg
shuffle(d.rows, rng=random.Random(1))   # repeatable, own RNG
same(a, b, cliff=0.195, conf=1.36)      # are samples a, b the same?

Sections: NAME | DESIGN | API | STYLE | LICENSE | AUTHOR
Files: nuff.py | test_nuff.py | Makefile | pyproject.toml
NAME
nuff - one file of tiny stdlib python tricks (no global config)
DESIGN
One file, themed sections, no module-level `the` (no global
config object). Tuning rides along as keyword args, so any
function drops into another app:
disty(data, row, p=2)
same(xs, ys, cliff=0.195, conf=1.36)
shuffle(lst, rng=random.Random(seed))
API
records / io / format
  o               SimpleNamespace alias: o(x=1).x == 1
  thing(s)        coerce str -> int|float|bool|str
  settings(s)     every var=val in s -> an o (vals coerced)
  csv(file)       yield typed rows ('#' = comment)
  say(x, dec=2)   pretty str; whole floats as ints
  sho(rows,just)  align list[list[str]] to cols; just='>'/'<'
  main(funs)      run funs[name] for each --name in argv
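The coercion behind `thing` and `settings` can be sketched as
below. This is a minimal illustration, not nuff's source: the
boolean spellings ("true"/"false", any case) and the `\w+=\S+`
pattern used to find var=val pairs are assumptions.

```python
import re
from types import SimpleNamespace as o  # o(x=1).x == 1

def thing(s):
    """Coerce a string to int, float, or bool; else keep the str."""
    s = s.strip()
    for cast in (int, float):
        try:
            return cast(s)
        except ValueError:
            pass
    # Assumed boolean spellings; anything else stays a string.
    return {"true": True, "false": False}.get(s.lower(), s)

def settings(s):
    """Collect every var=val in s into an o, coercing each val."""
    pairs = re.findall(r"(\w+)=(\S+)", s)
    return o(**{k: thing(v) for k, v in pairs})

cfg = settings("p=2 cliff=0.195 fast=true name=auto93")
print(cfg.p, cfg.cliff, cfg.fast, cfg.name)
```

Trying `int` before `float` keeps whole numbers as ints, so "2"
becomes 2 rather than 2.0.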
rand (pass your own random.Random(seed) for repeatability)
  shuffle(lst, rng)  shuffled copy
  some(lst, k, rng)  sample without replacement
  one(lst, rng)      one random item
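A minimal sketch of the three helpers, assuming each is a thin
wrapper over the matching `random.Random` method. Capping `k` at
the list length in `some` is an assumption, not something the
summary above states.

```python
import random

def shuffle(lst, rng):
    """Return a shuffled copy; the input list is left alone."""
    out = list(lst)
    rng.shuffle(out)
    return out

def some(lst, k, rng):
    """Sample up to k items without replacement."""
    lst = list(lst)
    return rng.sample(lst, min(k, len(lst)))  # cap is assumed

def one(lst, rng):
    """Pick one random item."""
    return rng.choice(list(lst))

items = list(range(10))
a = shuffle(items, rng=random.Random(1))
b = shuffle(items, rng=random.Random(1))
print(a == b)  # True: same seed, same order
```

Because the generator is an argument, two callers can hold
separate seeded streams without disturbing each other or the
module-level `random` state.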
stats (non-parametric "are these two the same?")
  cliffs(xs, ys)         Cliff's delta effect size 0..1
  ks(xs, ys)             Kolmogorov-Smirnov CDF gap
  same(xs, ys, cliff=.195, conf=1.36)
  top_tier(groups, ...)  names tied for best (min median)
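One way the two tests could combine is sketched below. It is an
illustration under stated assumptions, not nuff's code: `cliffs`
returns the absolute Cliff's delta, `cliff` caps that effect
size, and `conf` is the coefficient in the large-sample KS
critical value conf*sqrt((n+m)/(n*m)).

```python
from bisect import bisect_right
from math import sqrt

def cliffs(xs, ys):
    """Absolute Cliff's delta: |#(x>y) - #(x<y)| / (n*m)."""
    gt = sum(x > y for x in xs for y in ys)
    lt = sum(x < y for x in xs for y in ys)
    return abs(gt - lt) / (len(xs) * len(ys))

def ks(xs, ys):
    """Largest gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    cdf = lambda lst, v: bisect_right(lst, v) / len(lst)
    return max(abs(cdf(xs, v) - cdf(ys, v)) for v in xs + ys)

def same(xs, ys, cliff=0.195, conf=1.36):
    """True if neither test sees a difference (assumed rule)."""
    n, m = len(xs), len(ys)
    crit = conf * sqrt((n + m) / (n * m))
    return cliffs(xs, ys) <= cliff and ks(xs, ys) <= crit

xs = list(range(20))
ys = [x + 100 for x in xs]
print(same(xs, xs), same(xs, ys))  # True False
```

The pairwise loop in `cliffs` is O(n*m), which is fine for a
sketch but slow on large samples.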
columns (lightweight: Sym = dict {value:count}, Num = (n,mu,m2);
         dispatch by isa(col, Sym), no tags)
  Sym()  Num(n,mu,m2)    a dict / a 3-tuple; n_/mu_/m2_ read it
  welford(num,v,inc=1)   Num + v (inc=-1 removes) -> new Num
  mix(i,j,inc=1)         merge two same-type cols (inc=-1 = out)
  mid(col)  spread(col)  mode/mean, entropy/stdev
  norm(num,v)            0..1 via a logistic on v's z-score (Num)
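The Num update can be sketched with Welford's algorithm on a
plain (n, mu, m2) tuple. This is a hedged illustration, not the
file's source: the reset to (0, 0.0, 0.0) when the count reaches
zero, the clamp of m2 at 0, and the helper `stdev` are
assumptions.

```python
from math import sqrt

def welford(num, v, inc=1):
    """Return a new (n, mu, m2) with v added (inc=1) or
    removed (inc=-1); the input tuple is never changed."""
    n, mu, m2 = num
    n2 = n + inc
    if n2 <= 0:
        return (0, 0.0, 0.0)        # assumed reset when empty
    d = v - mu
    mu2 = mu + inc * d / n2
    m22 = m2 + inc * d * (v - mu2)
    return (n2, mu2, max(0.0, m22))  # clamp rounding error

def stdev(num):
    """Sample standard deviation from (n, mu, m2)."""
    n, _, m2 = num
    return sqrt(m2 / (n - 1)) if n > 1 else 0.0

num = (0, 0.0, 0.0)
for v in [2, 4, 4, 4, 5, 5, 7, 9]:
    num = welford(num, v)
less = welford(num, 9, inc=-1)  # same as never having seen 9
print(num[0], less[0])          # 8 7
```

Removal reuses the add formula with the sign flipped, so one
function covers both directions.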
table (immutable cols -> add RETURNS the new it, never mutates)
  add(it,v,inc=1)        fold v into a Sym/Num, or a row into a
                         Data; returns it (inc=-1 subtracts)
  adds(src,it=None)      fold all of src into it (default a Num)
  Data(rows)             -> o(names, cols{at:col}, x, y, goal,
                         klass, rows); first row = the header.
                         Upper=Num lower=Sym; +/-/! = goal y;
                         + maximizes; ! = klass; X = skip
  clone(data, src=None)  new Data, same columns, fresh rows
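The header rules above can be sketched as a small classifier.
The helper `roles` and the sample header are hypothetical, and
two readings are assumptions: that case is tested on the first
character of a name, and that +, -, !, and X are tested on its
last character.

```python
from types import SimpleNamespace as o

def roles(names):
    """Classify each header name (hypothetical helper)."""
    out = []
    for at, name in enumerate(names):
        end = name[-1]               # assumed: marker is a suffix
        out.append(o(
            at=at, name=name,
            kind="Num" if name[0].isupper() else "Sym",
            skip=end == "X",
            goal=end in "+-!",
            klass=end == "!",
            maximize=end == "+"))
    return out

cols = roles(["Age", "job", "IdX", "Salary+", "Risk-", "ok!"])
x = [c.name for c in cols if not c.goal and not c.skip]
y = [c.name for c in cols if c.goal]
print(x, y)
```

Here `x` holds the independent columns and `y` the goals. That
skipped columns are left out of both lists is also an
assumption.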
distance (exponent `p` is a keyword)
  minkowski(vals, p=2)
  disty(data, row, p=2)      distance to best goals (0=ideal)
  distx(data, r1, r2, p=2)   distance between two rows on x
  gap(col, u, v)             per-column value distance 0..1
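A sketch of how a keyword `p` can drive the distance. Dividing
by the number of values (so that gaps in 0..1 give a distance in
0..1) is an assumption, and `to_ideal` is a hypothetical
stand-in for `disty` that takes already-normalized goal values
instead of a Data and a row.

```python
def minkowski(vals, p=2):
    """Size-normalized Minkowski norm of a list of gaps."""
    vals = list(vals)
    return (sum(abs(v) ** p for v in vals) / len(vals)) ** (1 / p)

def to_ideal(norms, bests, p=2):
    """Distance from normalized goals to their ideal values
    (hypothetical; 0 = ideal)."""
    return minkowski((n - b for n, b in zip(norms, bests)), p=p)

# Two goals: first maximized (ideal 1), second minimized (ideal 0).
print(to_ideal([1.0, 0.0], [1, 0]))       # 0.0: the ideal row
print(to_ideal([0.4, 0.8], [1, 0], p=1))  # mean absolute gap
```

With p=1 this is the mean absolute gap; with p=2 it is the root
mean squared gap.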
bayes (naive bayes; m, k carried as kwargs, no global the)
  like(col, v, n=0, prior=0, k=1)    how a column likes a value
                                     (n = #rows, for the Sym case)
  likes(data, row, nrows, nklasses)  log-likelihood of a row
  confuse(pairs)                     (want,got) -> per-class
                                     o(pd, pf, prec, acc)
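The per-class report from `confuse` can be sketched with the
usual confusion-matrix definitions: pd is recall, pf is the
false-alarm rate. Reporting rates as fractions in 0..1 and
returning 0 on an empty denominator are assumptions, not
documented behavior.

```python
from types import SimpleNamespace as o

def confuse(pairs):
    """(want, got) pairs -> {klass: o(pd, pf, prec, acc)}."""
    pairs = list(pairs)
    div = lambda a, b: a / b if b else 0  # assumed: 0 if empty
    out = {}
    for k in sorted({x for pair in pairs for x in pair}):
        tp = sum(w == k and g == k for w, g in pairs)
        fn = sum(w == k and g != k for w, g in pairs)
        fp = sum(w != k and g == k for w, g in pairs)
        tn = len(pairs) - tp - fn - fp
        out[k] = o(pd=div(tp, tp + fn),      # recall
                   pf=div(fp, fp + tn),      # false alarms
                   prec=div(tp, tp + fp),    # precision
                   acc=div(tp + tn, len(pairs)))
    return out

pairs = [("a", "a"), ("a", "a"), ("a", "b"),
         ("b", "b"), ("b", "a")]
report = confuse(pairs)
print(report["a"])
```

Each class is scored one-versus-rest, so a two-class problem
yields two records.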
tree (min-variance binary tree; exact cuts, no binning)
  tree(data, leaf=3, maxDepth=12)  build; y defaults to disty
  treeCuts(data, rows, y)          yield candidate cuts
  treeCut(data, rows, y, leaf=3)   the best cut (or None)
  treePredict(t, row)              leaf disty mean for a row
  treeShow(data, t)                table: +/- d2h n ymeans tree
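A minimum-variance cut on one numeric column can be sketched as
below. The helper `best_cut` is hypothetical and several choices
are assumptions: the score is the size-weighted population
variance of y on each side, `leaf` is the fewest rows either
side may keep, and rows with x <= cut go left.

```python
from statistics import pvariance

def best_cut(pairs, leaf=3):
    """pairs: (x, y) tuples. Return (score, cut) for the split
    with the least weighted y variance, or None if no split
    leaves at least `leaf` rows on each side."""
    pairs = sorted(pairs)
    best = None
    for i in range(leaf, len(pairs) - leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue                 # cannot cut between equal x
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * pvariance(left) +
                 len(right) * pvariance(right)) / len(pairs)
        if best is None or score < best[0]:
            best = (score, pairs[i - 1][0])
    return best

ys = [0.1, 0.1, 0.2, 0.1, 0.9, 0.8, 0.9, 0.9]
pairs = list(zip(range(1, 9), ys))
score, cut = best_cut(pairs, leaf=2)
print(cut)  # 4: rows with x <= 4 have the low y values
```

Every distinct x value is tried as a cut point, which is what
"exact cuts, no binning" suggests. Recomputing the variance at
each cut makes this O(n^2); incremental Num updates would avoid
that.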
STYLE
Minimal python: one file, one-line comments, ~65-char lines,
very short functions, `i` (not self), records over classes.
Threshold for a new file vs a new gist: parts you *import* stay
in here; *wholes* you *run* (apps, other languages, data) get
their own gist.
LICENSE
MIT. https://choosealicense.com/licenses/mit/
AUTHOR
Tim Menzies <timm@ieee.org>