, . , . , .
1.
2.
3.
Python Azure
, Azure Python-. zip- (, , ).
: . , , . ( ).
, ( ) .
. MS Analysis Service
. data mining.
, , : . :
, . MS Analysis Services, , .
. , 19 5 . 64 14 .
, .
. , 85% 100%, . . , . , 90 .
. , .
, . : logistic regression, support vector machine, decision jungle/random forest, Bayesian network. , .
, . .
Logistic regression
, . , .
. , 1/(1+e^(-score)), score , . -1 +1. .
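The formula 1/(1+e^(-score)) quoted above can be tried out directly. The following is a minimal sketch, not the Azure implementation: the function names `linear_score` and `logistic`, and the assumption that the score is a weighted sum of the features plus a bias, are illustrative and not taken from the article.

```python
import math


def linear_score(weights, features, bias=0.0):
    """Hypothetical score: weighted sum of the features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def logistic(score):
    """Map a raw score to a value in (0, 1) using 1 / (1 + e^(-score))."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    z = math.exp(score)
    return z / (1.0 + z)


if __name__ == "__main__":
    score = linear_score([0.5, -1.0], [2.0, 1.0])
    print(score, logistic(score))
```

A score of zero maps to exactly 0.5, and scores of equal size and opposite sign map to values that sum to one.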
:
- Optimization tolerance . , . 1E-7.
- L1 & L2 regularization 1 2 . L1 , . L2 c , . 1.
- L-BFGS () . ( ), . : , . , . . ( ). , 10.
- Random seed .
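The L1 and L2 regularization weights in the list above can be illustrated with the textbook penalty terms. This is a sketch under stated assumptions: L1 is taken as the sum of absolute weights and L2 as the sum of squared weights, both coefficients default to 1 here, and the exact scaling used inside Azure may differ. The function name `regularization_penalty` is hypothetical.

```python
def regularization_penalty(weights, l1_weight=1.0, l2_weight=1.0):
    """Return the combined L1 + L2 penalty added to the training loss."""
    # L1 term: sum of absolute values, tends to push weights to exactly zero.
    l1_term = sum(abs(w) for w in weights)
    # L2 term: sum of squares, penalizes large weights more heavily.
    l2_term = sum(w * w for w in weights)
    return l1_weight * l1_term + l2_weight * l2_term


if __name__ == "__main__":
    print(regularization_penalty([1.0, -2.0]))
```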
, . , 5%, 15%. , , . , . .
Support Vector Machine
, . . , . , .
:
- Number of iterations . . 1 10.
- Lambda L1
- Normalize features . .
- Project to unit sphere . .
- Allow unknown category , . , , .
- Random seed .
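The "Project to unit sphere" option in the list above can be pictured with a small sketch. It assumes the usual meaning of the phrase, scaling each feature vector to unit Euclidean length; what the Azure module does internally may differ, and the function name is hypothetical.

```python
import math


def project_to_unit_sphere(vector):
    """Scale a feature vector so that its Euclidean norm equals 1."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector has no direction; return it unchanged.
        return list(vector)
    return [x / norm for x in vector]


if __name__ == "__main__":
    print(project_to_unit_sphere([3.0, 4.0]))
```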
, . . , , , .
Bayesian network
, . , , . . , , , .
:
- Number of training iterations SVM.
- Include bias . , .
- Allow unknown values SVM.
, . , .
Random Forest
. ( , ). , . . , .
m (m , , ). . , , .
:
- Resampling method bagging, . replicate, .
- Number of decision trees. , , .
- Maximum depth . 32 . .
- Number of random splits per node . , 10 30.
- Minimum number of samples per leaf node ( ). 1, .
- Allow unknown values SVM.
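The bagging resampling method named in the list above can be sketched in a few lines. This is an illustration of the general technique and not Azure's code: each tree gets a bootstrap sample drawn with replacement, and the forest answers by majority vote. The helper names and the seed handling are assumptions.

```python
import random
from collections import Counter


def bootstrap_sample(rows, seed=None):
    """Draw len(rows) rows with replacement: the training set for one tree."""
    rng = random.Random(seed)
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Combine the per-tree class predictions into a single answer."""
    return Counter(predictions).most_common(1)[0][0]


if __name__ == "__main__":
    rows = list(range(10))
    print(bootstrap_sample(rows, seed=42))
    print(majority_vote([1, 0, 1]))
```

With the replicate method, by contrast, every tree would be trained on the same unmodified rows.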
Random forest , 8% 13%. - . , , .
, , create trainer mode. Parameter Range, , , . .
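The Parameter Range trainer mode mentioned above amounts to trying several values per parameter. A minimal sketch of enumerating such a grid follows; the parameter names and values are made up for illustration, and this does not reproduce how Azure performs the sweep.

```python
from itertools import product


def parameter_grid(ranges):
    """Return every combination of the candidate values as a list of dicts."""
    names = sorted(ranges)
    combos = product(*(ranges[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


if __name__ == "__main__":
    # Hypothetical ranges for a random forest sweep.
    grid = parameter_grid({"max_depth": [16, 32], "num_trees": [8, 16, 32]})
    for params in grid:
        print(params)
```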
cross-validation variance
- . , . . . .
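Cross-validation variance can be made concrete with a short sketch: split the samples into k folds, score the model on each fold, and look at the spread of the scores. The helpers below are illustrative assumptions; the fold scores in the usage example are placeholders, not results from the article.

```python
from statistics import mean, pvariance


def k_fold_indices(n_samples, k):
    """Split the indices 0..n_samples-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        # The first (n_samples % k) folds receive one extra sample.
        size = n_samples // k + (1 if i < n_samples % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def summarize_fold_scores(scores):
    """Return the mean and the population variance of the per-fold scores."""
    return mean(scores), pvariance(scores)


if __name__ == "__main__":
    print(k_fold_indices(10, 3))
    # Placeholder accuracies, one per fold.
    print(summarize_fold_scores([0.8, 0.9, 1.0]))
```

A small variance across folds suggests the score does not depend much on which rows ended up in the held-out part.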
, Random Forest.
. , .
:
- Python- .
- MS Analysis Services .
- - .
. , .
WaveAccess , . , machine learning WaveAccess:
, . .
https://habrahabr.ru/post/331484/