ML Boot Camp V, 2nd place

, 09 2017 . 10:19 +

, ML Boot Camp V - .



100 000 , 70% , 10% (public) 20% (private), . , , - () ( 70% 30%). . log loss.
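Since the metric is log loss, a minimal sketch of how it is computed may help. This is a plain-Python illustration, not the contest's scoring code; the `eps` clipping value and the sample labels are assumptions made for the example.

```python
import math

def log_loss(y_true, y_pred, eps=1e-15):
    """Mean binary cross-entropy; predictions are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

# Hypothetical labels; predicting 0.5 everywhere gives ln 2 as a baseline.
labels = [1, 0, 1, 0]
probs = [0.5, 0.5, 0.5, 0.5]
baseline = log_loss(labels, probs)
```

The constant-0.5 baseline of ln 2 ≈ 0.693 gives a sense of scale for the scores of about 0.537 to 0.543 quoted in this post.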


11 :


  • , , ,
  • , (3 : , ) ( 3 ).
  • , , ( )

( ), 10% . . , , , .


-


-, smoke, alco, active. , 10% . 7 - (CV) , smoke, alco, active:


  • ( 10% smoke, alco, active). , (NaN), XGBoost.
  • 10%, . , NaN.
  • NaN
  • smoke, alco,active .

\ , , , train-test.


CV, 10%.


, - , CV , , .



, , CV . (), . CV. public private, CV, public, private ( logloss 0.5, , 370 0.5370, 427.78 0.542778):



, ( , ).


Spearman rho   CV   Public   Private
CV             1    0.723    0.915
Public         -    1        0.643
Private        -    -        1
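For reference, a rank correlation like the ones in this table can be computed as the Pearson correlation of the ranks. The sketch below is plain Python with no external libraries; the five score pairs are hypothetical and are not the data behind the table.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical CV and public scores for five submissions.
cv_scores = [0.5370, 0.5372, 0.5375, 0.5380, 0.5390]
public_scores = [0.5431, 0.5430, 0.5435, 0.5440, 0.5445]
rho = spearman_rho(cv_scores, public_scores)
```

With real submissions, the same function would be applied to each pair of score columns (CV, public, private).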

, - private ( ), public CV private .


: CV, CV NaN ( , ). , , . public-private .




:


  • Regularized Gradient Boosting ( XGBoost) , , . xgb.
  • Neural Networks ( Keras) feed forward networks, autoencoders, baseline xgb, .
  • sklearn , xgb RF, ExtraTrees, ..
  • ( brew) baseline xgb.

, 2-3 xgb (- 3-7 ), , 1-5 xgb.
( bayes_opt), , . , , ( min_child_weight reg_lambda) , graduate student descent.


I


, , :


  • 12000 1200 100 10
  • 10 100 10 1
  • 25 125
  • 70 170
  • ,
  • 0, 40
  • ..

, CV ~0.5375 public ~0.5435, .
.






CV.


II


, , , . CV ~0.5370 ( ~0.5375), public ~0.5431 ( ~0.5435).


. (, 1100 2000) train test. , , 10 , . , . , , 1211 1620, 120 160.


, (, ). , 1/1099 1/2088 110/90 120/80, 14900/90 140/90. , , 585 85, 701 170, 401 140.
, . , 13/0 130/80, . .


, . , 150/60 ( , ) 90 .


, , ( - ).


, , 1379 (1.97%), 194 public (1.94%), 402 private (2.01%). , 2% , CV. , .



365.25, . , . 13 . , CV ~1-2 . .




, , . , ( ). . ( )




BMI (= $inline$weight/(height/100)^2$inline$) . BMI , , . , cv, . BMI BMI.
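A minimal sketch of the BMI computation, assuming weight is given in kilograms and height in centimetres (hence the division by 100). The function and argument names are illustrative, not taken from the author's code.

```python
def bmi(weight_kg, height_cm):
    """Body mass index: weight in kg divided by squared height in metres."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2

# Hypothetical person: 70 kg, 170 cm.
example = bmi(70.0, 170.0)
```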




, , c 5.
. CV, :


  • Pulse pressure
  • (85 <= ap_hi <= 125 & 55 <= ap_lo <= 85)
  • +
  • (age - (age/2).round()*2) > 0
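The recoverable features from this list can be sketched for a single record as follows. This is plain Python, not the author's code. It assumes pulse pressure is `ap_hi - ap_lo` (the usual definition) and reads the last bullet as age minus its nearest even number. The output names are hypothetical, and the bullet whose operands did not survive is left out.

```python
def engineered_features(row):
    """Return three of the listed features for one record given as a dict."""
    ap_hi, ap_lo, age = row["ap_hi"], row["ap_lo"], row["age"]
    return {
        # Systolic minus diastolic pressure.
        "pulse_pressure": ap_hi - ap_lo,
        # 1 when both pressures fall inside the ranges given in the list.
        "normal_pressure": int(85 <= ap_hi <= 125 and 55 <= ap_lo <= 85),
        # 1 when age lies above the even number nearest to it.
        # round() rounds halves to even, as NumPy's round does.
        "age_above_even": int(age - round(age / 2) * 2 > 0),
    }

# Hypothetical record.
sample = engineered_features({"ap_hi": 120, "ap_lo": 80, "age": 50.3})
```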

( ) xgb :



2 xgb . github ( CV 0.5370, public 0.5431, private 0.530569 2 ).



xgb , ( , , ) , 8 public 0.5430-31 0.54288. 4 public ( , 0.5431 1, 0.5432 1/2, 0.5433 1/3), , CV. 8 , , (), 9 xgb. , , , , , , NaN. , ( 1/4) public 0.542778 ( 17 , github).
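The weights 1, 1/2 and 1/3 attached to public scores of 0.5431, 0.5432 and 0.5433 suggest a weighted average of model predictions. The sketch below assumes a plain arithmetic weighted mean of probabilities, which may differ from the averaging actually used. The three prediction lists are hypothetical.

```python
def weighted_blend(predictions, weights):
    """Weighted arithmetic mean of per-model probability lists."""
    total = sum(weights)
    n_rows = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions)) / total
        for i in range(n_rows)
    ]

# Hypothetical predictions from three models on two rows.
model_preds = [[0.60, 0.20], [0.50, 0.30], [0.40, 0.10]]
# Weights by public-score tier: best tier 1, next 1/2, next 1/3.
weights = [1.0, 1.0 / 2, 1.0 / 3]
blend = weighted_blend(model_preds, weights)
```

Weighting by leaderboard tier pulls the blend toward the models with the best public scores, but every model still contributes.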


, - -, . ? , 90% CV 0.5370-0.5371, , , . , public , , , 2 private 0.5304688. , , , , 2 , .



, / , . , , ..


, , git, - , , \\ . , , , .


, , . , , , .


, , github.

Original source: habrahabr.ru (comments, light).

https://habrahabr.ru/post/335226/
