[ ] ML Boot Camp V, 3

, 08 2017 . 17:23 +
ML Boot Camp V Mail.Ru. - . . .

ML Boot Camp III - 2017 - . 5 kaggle . , , 3 .


100.000 . , , , , , .

, , . , .

50 30+ , 16020, . .


python :


Csv vs pickle


csv . pickle 2 . :

import gzip
import pickle
with gzip.open('../run/local/pred_1.pickle.gz', 'wb') as f:
    pickle.dump((x, y), f)
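
Reading the archive back mirrors the dump call above. The following is a minimal round-trip sketch, not the author's code: the helper names save_pair and load_pair, the sample data, and the temporary directory are all hypothetical, and x and y are assumed to be any picklable objects.

```python
import gzip
import os
import pickle
import tempfile


def save_pair(path, x, y):
    """Write the tuple (x, y) as a gzip-compressed pickle."""
    with gzip.open(path, 'wb') as f:
        pickle.dump((x, y), f, protocol=pickle.HIGHEST_PROTOCOL)


def load_pair(path):
    """Read back the (x, y) tuple written by save_pair."""
    with gzip.open(path, 'rb') as f:
        return pickle.load(f)


# Hypothetical stand-in data; the real x and y are not shown in the article.
x = [[0.1, 0.2], [0.3, 0.4]]
y = [0, 1]

# A temporary directory is used here instead of the article's ../run/local/ path.
path = os.path.join(tempfile.mkdtemp(), 'pred_1.pickle.gz')
save_pair(path, x, y)
x_loaded, y_loaded = load_pair(path)
```

Only unpickle files you produced yourself: pickle.load can execute arbitrary code from an untrusted file.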


, github. old/ , , , . - , .

2


2 , . . - , , . , .

2


2 . .

: -> -> 1 -> 2 . , , , . . , , , . , .

, . , .


2 1 . - , . . ( ), . , .

, . Random subspace method, , random . ( * (2 ^ )). , , , 2 .


, - . - . , .

, . (2) xgboost. .

:

  1. 0.0001, , .
  2. : 0, 1. .
  3. .
  4. ( .2) , .
  5. ( .3) , , .
  6. , NaN.


. , , - .. .

, . . , .

:

  1. , , $\frac{ap\_hi + x \cdot ap\_lo}{x + 1}$ x. / ( $ap\_X = a + b \cdot age + c \cdot weight$). .

  2. , .1, . . .
  3. , , . (ord()). -1.

  4. , .3, (one-hot encoding).

  5. .4, PCA kaggle mercedes, .

  6. . , 10 , . 10 9 (- ). , , .

  7. .6, .2.

  8. .7, 5.

  9. .7, 3.

  10. k-means, 2, 5, 10, 15, 25. .

  11. .10, , 3.
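
The first expression in item 1 is a weighted average of the two pressure readings. A minimal sketch of computing it follows, assuming ap_hi and ap_lo are the systolic and diastolic pressure columns. The function name, the sample values, and the choice of x are hypothetical; the values of x actually tried are not recoverable from the text.

```python
import numpy as np


def weighted_pressure(ap_hi, ap_lo, x):
    """Return (ap_hi + x * ap_lo) / (x + 1), the weighted average from item 1."""
    ap_hi = np.asarray(ap_hi, dtype=float)
    ap_lo = np.asarray(ap_lo, dtype=float)
    return (ap_hi + x * ap_lo) / (x + 1)


# Hypothetical sample readings.
ap_hi = np.array([120.0, 140.0])
ap_lo = np.array([80.0, 90.0])

# x = 1 gives the plain mean of the two readings;
# x = 2 gives the common mean arterial pressure estimate (ap_hi + 2 * ap_lo) / 3.
features = {x: weighted_pressure(ap_hi, ap_lo, x) for x in (1, 2)}
```

Each value of x yields one candidate feature column, so several weights can be compared on cross-validation.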


( ), , . . , . - , . , - . .

. . , .

. . 3 :

  1. , , ;
  2. , ;
  3. , , ;

, python. . 2 , , .

, 0 1 . logloss- , 0 1 1e-5. np.clip(z, 1e-5, 1-1e-5) . , 0.1-0.93.
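
To illustrate why predictions are clipped before scoring, here is a minimal sketch of binary log loss with np.clip applied first. The function name binary_logloss and the sample arrays are hypothetical; only the bounds 1e-5 and 1-1e-5 come from the text.

```python
import numpy as np


def binary_logloss(y_true, z, eps=1e-5):
    """Mean binary log loss after clipping predictions to [eps, 1 - eps]."""
    z = np.clip(z, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(z) + (1 - y_true) * np.log(1 - z)))


# Hypothetical labels and predictions; the third prediction is confidently wrong.
y_true = np.array([1.0, 0.0, 1.0, 0.0])
z = np.array([1.0, 0.0, 0.0, 0.2])

loss = binary_logloss(y_true, z)
```

Without the clip, the third sample would contribute -log(0), which is infinite. With it, the worst possible penalty for a single sample is capped at -log(1e-5), roughly 11.5.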

hyperopt


hyperopt (). , 20. 2 hyperopt bootstrapping 20 , . .

1


1 0 . . .

, 1 - . 2 :

  • (keras)
  • (XGBoost, LightGBM, rf, et)

. hyperopt.


- , . 64-64 leaky relu 1-5 , .

:

  • ;
  • ( 256);
  • - , ( 0.7, , ); nan- batch normalization ;
  • - (64-128);
  • ;
  • - (16);
  • ;
  • 1 .

. , , 2 .

(- 0 ), ReLU (- , , 0, ) - . Parametric Relu, Scaled Exponential Linear Units. - .

, KFold sklearn. , .

, , . , . callback- keras , learning rate .

, ( ) learning rate - . .

, callback- . callback- . learning rate , , , callback-.

,


2 bagging random forest extra trees 2 XGBoost LightGBM. - , - , , . LightGBM XGBoost .

( 3) . 2 . 1 .

LightGBM XGBoost , . 10000 . . random forest extra trees sklearn , hyperopt, , , . , , .


. 1 . . , , 1 , . 2 .

2


, 1 1 4 , 2 190 . . 2 1 ( 2 ).

2 , , .

2 . , .

, , , . 2 . BayesianRidge, Ridge . 20 .

hyperopt , - BayesianRidge Ridge sklearn, BayesianRidge Ridge.


10 . cv 0.534-0.535 0.543-0.544, . , 30 . 30 10 .

0.535-0.536, 0.543 . 3 30 30 0.7 0.3 . 30 cv . random_state. 0.537.

, , , . 2 0.543 0.538 . , 12 7 3 , .
Original source: habrahabr.ru (comments, light).

https://habrahabr.ru/post/335188/
