machine learning - How should zero standard deviation in one of the features be handled in a multivariate Gaussian distribution?

- March 15, 2013

I am using a multivariate Gaussian distribution to detect abnormalities. This is how the training set looks:

19-04-16    05:30:31    1   0   0   377816  305172  5567044 0   0   0   14  62  75  0   0   100 0   0
<date>      <time>      <------------------------------ features ------------------------------>

Let's say one of the above features never changes and always remains zero.

I calculate the mean as mu:

mu = mean(x)'

and the variance as sigma2:

sigma2 = ((1/m) * (sum((x - mu') .^ 2)))'

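The probability of each individual feature in each data sample is calculated using the standard Gaussian formula:

p(x_j; mu_j, sigma2_j) = (1 / sqrt(2 * pi * sigma2_j)) * exp(-(x_j - mu_j)^2 / (2 * sigma2_j))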

For a particular feature, if all the values are zero, the mean (mu) is also zero, and consequently sigma2 is zero. So when I calculate the probability through the Gaussian distribution, I get a "divide by zero" problem.
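The failure can be reproduced with a small sketch. The toy matrix below is hypothetical (it is not the poster's data), and the density expression is the standard univariate Gaussian, written out on the assumption that this is the formula meant above. In Octave the division does not raise an error; it silently produces NaN for the constant feature.

```octave
% Hypothetical toy data: 4 samples, 3 features; the second feature is constant.
x = [1 0 10;
     2 0 12;
     3 0 11;
     2 0 13];
m = size(x, 1);

% Mean and variance exactly as computed in the question.
mu = mean(x)';
sigma2 = ((1/m) * (sum((x - mu') .^ 2)))';

% Per-feature Gaussian density of the first training sample.
x1 = x(1, :)';
p = (1 ./ sqrt(2 * pi * sigma2)) .* exp(-((x1 - mu) .^ 2) ./ (2 * sigma2));

disp(sigma2(2));     % variance of the constant feature is exactly 0
disp(isnan(p(2)));   % its density is NaN (Inf multiplied by exp of 0/0)
```

Because the product of the per-feature densities then becomes NaN as well, every sample gets an unusable score, not just samples where the constant feature moves.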

However, in the test sets this feature's value can fluctuate, and I would term that an abnormality. How should this be handled? I don't want to ignore such a feature.

So, this problem occurs every time you have a variable that is constant. Approximating it with a normal distribution makes absolutely no sense. All the information about such a variable is contained in a single value, and that is the intuition for why the division by zero occurs.

In the case where you know that the variable can fluctuate in ways that were not observed in the training set, you can force the variance of such a variable to be no smaller than some value eps. Apply max(variance(x), eps) instead of the classic variance definition. Then you can be sure that no division by zero occurs.
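A minimal sketch of the variance floor, reusing the question's mu and sigma2 expressions. The toy data, the test sample, and the floor value min_var are assumptions chosen for illustration. Note that in Octave and MATLAB the name eps is already the built-in machine epsilon (about 2.2e-16), so a separately named floor is used here to avoid shadowing it.

```octave
% Hypothetical toy data: the second feature is constant in training.
x = [1 0 10;
     2 0 12;
     3 0 11;
     2 0 13];
m = size(x, 1);

mu = mean(x)';
sigma2 = ((1/m) * (sum((x - mu') .^ 2)))';

% Assumption: min_var is a hand-picked floor, not a value from the original post.
min_var = 1e-6;
sigma2 = max(sigma2, min_var);   % element-wise floor on each feature's variance

% Hypothetical test sample in which the previously constant feature moves to 1.
x_test = [2; 1; 11];
p = (1 ./ sqrt(2 * pi * sigma2)) .* exp(-((x_test - mu) .^ 2) ./ (2 * sigma2));

disp(all(isfinite(p)));   % no NaN or Inf any more
disp(p(2));               % vanishingly small, so the sample scores as abnormal
```

The floor controls how strongly a deviation in the formerly constant feature is penalised: a smaller min_var makes any change look more abnormal, so it is worth picking it with the feature's realistic scale in mind.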
