
http://tiny.cc/optimiz

Example multi-objective optimization datasets: 100+ CSV files with self-describing headers. Column names encode type and goal, so no separate schema files are needed. Topics span software configuration, hyper-parameter tuning, process models, health, finance, sales, and reinforcement learning (RL). Data only, no code.

# install
git clone http://tiny.cc/konfig ../konfig
git clone http://tiny.cc/optimiz optimiz && cd optimiz
make help


Sections: NAME | DATA | FILES | SEE ALSO | LICENSE | AUTHOR

NAME

optimiz - multi-objective optimization CSVs. Headers are the
schema; numeric goal columns end in '+' (maximize) or '-' (minimize).

DATA

CSV with self-describing header; no separate schema file:

  first char UPPER  -> numeric (Num)
  first char lower  -> symbolic (Sym)
  suffix '+'        -> numeric goal, maximize
  suffix '-'        -> numeric goal, minimize
  suffix '!'        -> symbolic goal (klass)
  suffix 'X'        -> ignore
  else              -> predictor
  missing value     -> '?'

E.g. auto93.csv: Clndrs,Volume,HpX,Lbs-,Acc+,Model,origin,Mpg+
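
The header rules above can be sketched in a few lines of Python. This is a minimal illustration, not code from this repository (which ships data only): the names `Column`, `parse_header`, and `coerce` are hypothetical, and it assumes the suffix is always the last character of the column name and that numeric cells parse as floats.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Column:
    index: int   # position of the column in each row
    name: str    # header text, suffix included
    kind: str    # "Num" or "Sym"
    role: str    # "maximize", "minimize", "klass", "ignore", or "predictor"

# Last character of the header name -> role (per the DATA rules).
SUFFIX_ROLES = {"+": "maximize", "-": "minimize", "!": "klass", "X": "ignore"}

def parse_header(names):
    """Turn a list of header names into Column descriptions."""
    cols = []
    for i, raw in enumerate(names):
        name = raw.strip()
        kind = "Num" if name[:1].isupper() else "Sym"
        role = SUFFIX_ROLES.get(name[-1:], "predictor")
        cols.append(Column(i, name, kind, role))
    return cols

def coerce(cell, col):
    """Map '?' to None; parse Num cells as float; keep Sym cells as text."""
    cell = cell.strip()
    if cell == "?":
        return None
    return float(cell) if col.kind == "Num" else cell

if __name__ == "__main__":
    header = "Clndrs,Volume,HpX,Lbs-,Acc+,Model,origin,Mpg+".split(",")
    for col in parse_header(header):
        print(col.name, col.kind, col.role)
```

On the auto93 header, this marks Lbs- as a goal to minimize, Acc+ and Mpg+ as goals to maximize, HpX as ignored, and origin as the only symbolic predictor.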

FILES

Gists are flat, so filenames keep their source-topic prefix:

  behavior_data_*   people: HR attrition, dropout, players
  binary_config_*   binary option models (FFM/FM, SAT)
  config_*          software config spaces (SS-*, Apache, SQL)
  financial_data_*  bonds, loans, markets
  health_data_*     medical + wellbeing tables
  hpo_*             hyper-parameter optimization landscapes
  misc_*            small classics (auto93, wine, ...)
  process_*         software process models (pom, xomo)
  rl_*              reinforcement-learning traces
  sales_data_*      retail + marketing tables
  systems_*         computer-systems tuning
  test_*            tiny smoke-test tables
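
Since the topic lives in the filename, a flat directory can be regrouped by prefix. The following is a hedged sketch: `topic_of` and `group_by_topic` are hypothetical helper names, and the "unprefixed" bucket is an assumption for files that match none of the prefixes listed above.

```python
from collections import defaultdict

# Topic prefixes from the FILES table above.
PREFIXES = (
    "behavior_data_", "binary_config_", "config_", "financial_data_",
    "health_data_", "hpo_", "misc_", "process_", "rl_", "sales_data_",
    "systems_", "test_",
)

def topic_of(filename):
    """Return the topic prefix of a filename, without its trailing '_'."""
    for prefix in PREFIXES:
        if filename.startswith(prefix):
            return prefix.rstrip("_")
    return "unprefixed"  # assumption: bucket for files with no known prefix

def group_by_topic(filenames):
    """Map each topic to its sorted list of filenames."""
    groups = defaultdict(list)
    for name in sorted(filenames):
        groups[topic_of(name)].append(name)
    return dict(groups)

if __name__ == "__main__":
    import glob
    for topic, files in group_by_topic(glob.glob("*.csv")).items():
        print(f"{topic:15} {len(files)}")
```

Matching with `startswith` keeps `binary_config_*` and `config_*` apart, because a `binary_config_` filename does not begin with `config_`.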

SS-N.csv | auto93.csv

behavior_data_WA_Fn-UseC_-HR-Employee-Attrition.csv | behavior_data_all_players.csv | behavior_data_player_statistics_cleaned_final.csv | behavior_data_student_dropout.csv

binary_config_FFM-1000-200-0.50-SAT-1.csv | binary_config_FFM-125-25-0.50-SAT-1.csv | binary_config_FFM-250-50-0.50-SAT-1.csv | binary_config_FFM-500-100-0.50-SAT-1.csv | binary_config_FM-500-100-0.25-SAT-1.csv | binary_config_FM-500-100-0.50-SAT-1.csv | binary_config_FM-500-100-0.75-SAT-1.csv | binary_config_FM-500-100-1.00-SAT-1.csv | binary_config_Scrum100k.csv | binary_config_Scrum10k.csv | binary_config_Scrum1k.csv | binary_config_billing10k.csv

config_Apache_AllMeasurements.csv | config_HSMGP_num.csv | config_SQL_AllMeasurements.csv | config_SS-A.csv | config_SS-B.csv | config_SS-C.csv | config_SS-D.csv | config_SS-E.csv | config_SS-F.csv | config_SS-G.csv | config_SS-H.csv | config_SS-I.csv | config_SS-J.csv | config_SS-K.csv | config_SS-L.csv | config_SS-M.csv | config_SS-N.csv | config_SS-O.csv | config_SS-P.csv | config_SS-Q.csv | config_SS-R.csv | config_SS-S.csv | config_SS-T.csv | config_SS-U.csv | config_SS-V.csv | config_SS-W.csv | config_SS-X.csv | config_X264_AllMeasurements.csv | config_rs-6d-c3_obj1.csv | config_rs-6d-c3_obj2.csv | config_sol-6d-c2-obj1.csv | config_wc+rs-3d-c4-obj1.csv | config_wc+sol-3d-c4-obj1.csv | config_wc+wc-3d-c4-obj1.csv | config_wc-6d-c1-obj1.csv

financial_data_BankChurners.csv | financial_data_Loan.csv | financial_data_WA_Fn-UseC_-Telco-Customer-Churn.csv | financial_data_home_data_for_ml_course.csv

health_data_Data_COVID19_Indonesia.csv | health_data_Life_Expectancy_Data.csv | health_data_Medical_Data_and_Hospital_Readmissions.csv

hpo_Health-ClosedIssues0000.csv | hpo_Health-ClosedIssues0001.csv | hpo_Health-ClosedIssues0002.csv | hpo_Health-ClosedIssues0003.csv | hpo_Health-ClosedIssues0004.csv | hpo_Health-ClosedIssues0005.csv | hpo_Health-ClosedIssues0006.csv | hpo_Health-ClosedIssues0007.csv | hpo_Health-ClosedIssues0008.csv | hpo_Health-ClosedIssues0009.csv | hpo_Health-ClosedIssues0010.csv | hpo_Health-ClosedIssues0011.csv | hpo_Health-ClosedPRs0000.csv | hpo_Health-ClosedPRs0002.csv | hpo_Health-ClosedPRs0003.csv | hpo_Health-ClosedPRs0004.csv | hpo_Health-ClosedPRs0005.csv | hpo_Health-ClosedPRs0006.csv | hpo_Health-ClosedPRs0007.csv | hpo_Health-ClosedPRs0008.csv | hpo_Health-ClosedPRs0009.csv | hpo_Health-ClosedPRs0010.csv | hpo_Health-ClosedPRs0011.csv | hpo_Health-Commits0000.csv | hpo_Health-Commits0001.csv | hpo_Health-Commits0002.csv | hpo_Health-Commits0003.csv | hpo_Health-Commits0004.csv | hpo_Health-Commits0005.csv | hpo_Health-Commits0006.csv | hpo_Health-Commits0007.csv | hpo_Health-Commits0008.csv | hpo_Health-Commits0009.csv | hpo_Health-Commits0010.csv | hpo_Health-Commits0011.csv

misc_Car_price_cleaned.csv | misc_Wine_quality.csv | misc_auto93.csv | misc_multiLabel.csv

process_coc1000.csv | process_nasa93dem.csv | process_pom3a.csv | process_pom3b.csv | process_pom3c.csv | process_pom3d.csv | process_xomo_flight.csv | process_xomo_ground.csv | process_xomo_osp.csv | process_xomo_osp2.csv

rl_A2C_Acrobot.csv | rl_A2C_CartPole.csv

sales_data_Marketing_Analytics.csv | sales_data_accessories.csv | sales_data_dress-up.csv | sales_data_socks.csv | sales_data_wallpaper.csv

systems_7z.csv | systems_BDBC.csv | systems_HSQLDB.csv | systems_LLVM.csv | systems_PostgreSQL.csv | systems_dconvert.csv | systems_deeparch.csv | systems_exastencils.csv | systems_javagc.csv | systems_redis.csv | systems_storm.csv | systems_x264.csv

test_dataset120.csv | test_dataset600.csv

SEE ALSO

konfig    http://tiny.cc/konfig   shared Makefile, dotfiles
regress   http://tiny.cc/regress  regression datasets
klassif   http://tiny.cc/klassif  classification datasets
luamine   http://tiny.cc/luamine  code that reads these files

LICENSE

MIT. https://choosealicense.com/licenses/mit/

AUTHOR

Tim Menzies <timm@ieee.org>

built by gistsite