http://tiny.cc/klassif
Example classification datasets: 73 CSV files (anneal, audiology, COMPAS, diabetes, soybean, vote, ...) with self-describing headers. The klass column ends in '!', so no separate schema files are needed. Data only, no code.
# install
git clone http://tiny.cc/konfig ../konfig
git clone http://tiny.cc/klassif klassif && cd klassif
make help
Sections: NAME | DATA | FILES | SEE ALSO | LICENSE | AUTHOR
NAME
klassif - classification benchmark CSVs. Headers are the
schema; the symbolic goal column ends in '!'.
DATA
CSV with self-describing header; no separate schema file:
first char UPPER -> numeric (Num)
first char lower -> symbolic (Sym)
suffix '+'       -> numeric goal, maximize
suffix '-'       -> numeric goal, minimize
suffix '!'       -> symbolic goal (klass)
suffix 'X'       -> ignore
else             -> predictor
missing value    -> '?'
E.g. diabetes.csv: Preg,Plas,Pres,Skin,Insu,Mass,Pedi,Age,klass!
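The header rules above can be sketched in a few lines of Python. This repo ships no code, so everything here is an illustration, not the reader used by luamine: the names column_role, coerce, and read_rows are hypothetical, the order in which suffixes are checked is an assumption, and the sample row is invented for the example rather than copied from diabetes.csv.

```python
import csv
import io


def column_role(name):
    """Map one header name to (kind, role) using the DATA rules.

    kind comes from the first character, role from the last one.
    Assumption: suffix checks are independent of the first-char check.
    """
    kind = "Num" if name[:1].isupper() else "Sym"
    suffix_roles = {
        "+": "maximize",  # numeric goal, maximize
        "-": "minimize",  # numeric goal, minimize
        "!": "klass",     # symbolic goal
        "X": "ignore",    # column to skip
    }
    role = suffix_roles.get(name[-1:], "predictor")
    return kind, role


def coerce(cell, kind):
    """Turn one CSV cell into None ('?'), a float (Num), or a string (Sym)."""
    cell = cell.strip()
    if cell == "?":
        return None
    return float(cell) if kind == "Num" else cell


def read_rows(lines):
    """Read header plus rows from any iterable of CSV lines."""
    reader = csv.reader(lines)
    names = [n.strip() for n in next(reader)]
    roles = [column_role(n) for n in names]
    rows = [
        [coerce(cell, kind) for cell, (kind, _) in zip(row, roles)]
        for row in reader
        if row  # skip blank lines
    ]
    return names, roles, rows


# Header taken from the diabetes.csv example; the data row is made up.
SAMPLE = (
    "Preg,Plas,Pres,Skin,Insu,Mass,Pedi,Age,klass!\n"
    "6,148,72,35,?,33.6,0.627,50,tested_positive\n"
)

if __name__ == "__main__":
    names, roles, rows = read_rows(io.StringIO(SAMPLE))
    for name, (kind, role) in zip(names, roles):
        print(f"{name:8} {kind:4} {role}")
    print(rows[0])
```

To read a real file, pass an open file handle instead of the StringIO, e.g. read_rows(open("diabetes.csv", newline="")).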
FILES
COMPAS53.csv | anneal.csv | anneal.orig.csv | arrhythmia.csv | audiology.csv | autos.csv | balance.scale.csv | breast.cancer.csv | breast.w.csv | breastcancer.csv | bridges.version1.csv | bridges.version2.csv | car.csv | cmc.csv | colic.csv | colic.orig.csv | column2C.csv | column3C.csv | credit.a.csv | credit.g.csv | cylinder.bands.csv | dermatology.csv | diabetes.csv | ecoli.csv | flags.csv | german.csv | glass.csv | haberman.csv | heart.c.csv | heart.h.csv | heart.statlog.csv | hepatitis.csv | hypothyroid.csv | ionosphere.csv | iris.csv | kr.vs.kp.csv | labor.csv | letter.csv | lymph.csv | mnist_1.csv | mushroom.csv | nursery.csv | optdigits.csv | page.blocks.csv | pendigits.csv | postoperative.patient.data.csv | primary.tumor.csv | segment.csv | shuttle.landing.control.csv | sick.csv | solar.flare1.csv | solar.flare2.csv | sonar.csv | soybean.csv | spambase.csv | spect.test.csv | spect.train.csv | spectf.test.csv | spectf.train.csv | spectrometer.csv | splice.csv | sponge.csv | tae.csv | tic-tac-toe.csv | trains.csv | vehicle.csv | vote.csv | vowel.csv | waveform5000.csv | weather.csv | weathernom.csv | wine.csv | zoo.csv
SEE ALSO
konfig   http://tiny.cc/konfig   shared Makefile, dotfiles
optimiz  http://tiny.cc/optimiz  optimization datasets
regress  http://tiny.cc/regress  regression datasets
luamine  http://tiny.cc/luamine  code that reads these files
LICENSE
MIT. https://choosealicense.com/licenses/mit/
AUTHOR
Tim Menzies <timm@ieee.org>
