, 23 2017 . 11:16 +
, . , . , .




1.
2.
3.

Python Azure


, Azure Python-. zip- (, , ).

: . , , . ( ).



, ( ) .

. MS Analysis Service


. data mining.

, , : . :

  • ;
  • id ;
  • 3 .

, . MS Analysis Services, , .



. , 19 5 . 64 14 .

, .



. , 85% 100%, . . , . , 90 .

. , .


, . : logistic regression, support vector machine, decision jungle/random forest, Bayesian network. , .

, . .

Logistic regression


, . , .

. , 1/(1+e^(-score)), score , . -1 +1. .
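
The expression 1/(1+e^(-score)) is the standard logistic (sigmoid) function, which maps any real-valued score to a value between 0 and 1. A minimal sketch follows, assuming the score is a weighted sum of the input features. The names `logistic`, `predict_proba`, `weights` and `bias` are illustrative and do not come from the original article.

```python
import math

def logistic(score: float) -> float:
    """Logistic function 1 / (1 + e^(-score))."""
    # Use the algebraically equivalent form for negative scores so that
    # math.exp() never receives a large positive argument (overflow).
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    z = math.exp(score)
    return z / (1.0 + z)

def predict_proba(features, weights, bias=0.0):
    """Hypothetical linear model: score = bias + sum(w_i * x_i)."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(score)
```

A score of 0 gives exactly 0.5, large positive scores approach 1, and large negative scores approach 0.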



:

  • Optimization tolerance . , . 1E-7.

  • L1 & L2 regularization 1 2 . L1 , . L2 c , . 1.

  • L-BFGS () . ( ), . : , . , . . ( ). , 10.

  • Random seed .
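
To illustrate what the L1 and L2 regularization weights in the list above control, here is a rough sketch of a penalized training objective. It assumes the usual textbook form (log loss plus a weighted sum of absolute weights and a weighted sum of squared weights). The exact scaling used inside Azure ML is not stated in the article, so treat the function names and the formula as assumptions.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Average negative log-likelihood for binary labels (0 or 1)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so that log() stays finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def regularized_loss(y_true, p_pred, weights, l1=1.0, l2=1.0):
    """Log loss plus L1 (sum of |w|) and L2 (sum of w^2) penalties."""
    l1_penalty = l1 * sum(abs(w) for w in weights)
    l2_penalty = l2 * sum(w * w for w in weights)
    return log_loss(y_true, p_pred) + l1_penalty + l2_penalty
```

The L1 term tends to push individual weights to exactly zero, while the L2 term shrinks all weights without zeroing them, which is why the two are often tuned together.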



, . , 5%, 15%. , , . , . .

Support Vector Machine


, . . , . , .



:

  • Number of iterations . . 1 10.
  • Lambda L1
  • Normalize features . .
  • Project to unit sphere . .
  • Allow unknown category , . , , .
  • Random seed .



, . . , , , .

Bayesian network


, . , , . . , , , .



:

  • Number of training iterations SVM.
  • Include bias . , .
  • Allow unknown values SVM.



, . , .

Random Forest


. ( , ). , . . , .

m (m , , ). . , , .



:

  • Resampling method bagging, . replicate, .
  • Number of decision trees. , , .
  • Maximum depth . 32 . .
  • Number of random splits per node . , 10 30.
  • Minimum number of samples per leaf node ( ). 1, .
  • Allow unknown values SVM.
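
The two resampling options named in the list above, bagging and replicate, differ only in how each tree's training set is built. The sketch below is an illustration under assumptions: the function names are hypothetical and it does not reproduce Azure ML internals. With bagging, each tree sees a bootstrap sample drawn with replacement. With replicate, every tree sees the same full data set, so diversity comes only from the random choice of splits.

```python
import random
from collections import Counter

def make_training_sets(rows, n_trees, method="bagging", seed=0):
    """Return one training set per tree.

    "bagging":   bootstrap sample, drawn with replacement, same size as rows.
    "replicate": an unchanged copy of rows for every tree.
    """
    rng = random.Random(seed)
    sets = []
    for _ in range(n_trees):
        if method == "bagging":
            sets.append([rng.choice(rows) for _ in rows])
        elif method == "replicate":
            sets.append(list(rows))
        else:
            raise ValueError(f"unknown resampling method: {method}")
    return sets

def majority_vote(predictions):
    """Combine the class labels predicted by the individual trees."""
    return Counter(predictions).most_common(1)[0][0]
```

In a bootstrap sample some rows appear several times and others not at all, which is what makes the individual trees differ from one another.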



Random forest , 8% 13%. - . , , .

, , create trainer mode. Parameter Range, , , . .

cross-validation variance


- . , . . . .

, Random Forest.
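
As a rough sketch of how cross-validation exposes the variance of a model, the code below splits the rows into k folds, scores a model on each held-out fold, and reports the mean and variance of the scores. The helper names and the `train_and_score` callback are hypothetical. The article does not show how Azure ML performs this step, so this is only the generic procedure.

```python
import statistics

def k_fold_indices(n_rows, k):
    """Split indices 0..n_rows-1 into k contiguous folds of near-equal size."""
    base, extra = divmod(n_rows, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(rows, k, train_and_score):
    """Call train_and_score(train_rows, test_rows) once per fold."""
    scores = []
    for fold in k_fold_indices(len(rows), k):
        held_out = set(fold)
        train = [r for i, r in enumerate(rows) if i not in held_out]
        test = [rows[i] for i in fold]
        scores.append(train_and_score(train, test))
    return scores

def summarize(scores):
    """Mean and sample variance of the per-fold scores."""
    return statistics.mean(scores), statistics.variance(scores)
```

A low variance across folds suggests the model behaves consistently on unseen data. In practice the rows would normally be shuffled (or stratified by class) before splitting, which this sketch omits.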



. , .


:

  1. Python- .
  2. MS Analysis Services .
  3. - .

. , .


WaveAccess , . , machine learning WaveAccess:
, . .
Original source: habrahabr.ru

https://habrahabr.ru/post/331484/
