machine learning - How should zero standard deviation in one of the features be handled in a multivariate Gaussian distribution?


I am using a multivariate Gaussian distribution to detect anomalies. This is how my training set looks:

19-04-16    05:30:31    1   0   0   377816  305172  5567044 0   0   0   14  62  75  0   0   100 0   0
<date>      <time>      <------------------------------ features ------------------------------>

Let's say one of the above features does not change and always remains zero.

I calculate the mean, mu, as

mu = mean(x)' 

and the variance, sigma2, as

sigma2 = ((1/m) * (sum((x - mu') .^ 2)))' 
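For readers who want to try this outside Octave, the two expressions above can be sketched in plain Python as follows. This is only an illustrative translation: the function names and the small sample matrix are made up here, and, like the expression above, the variance divides by m rather than m - 1.

```python
def feature_means(rows):
    """Per-feature mean of a list of equal-length rows (one row per example)."""
    m = len(rows)
    return [sum(col) / m for col in zip(*rows)]


def feature_variances(rows):
    """Per-feature population variance (divides by m, like sigma2 above)."""
    m = len(rows)
    mu = feature_means(rows)
    return [
        sum((value - mu_j) ** 2 for value in col) / m
        for col, mu_j in zip(zip(*rows), mu)
    ]


# Hypothetical training set: 3 examples, 3 features; the middle one is constant.
x = [
    [1.0, 0.0, 14.0],
    [3.0, 0.0, 62.0],
    [5.0, 0.0, 75.0],
]

mu = feature_means(x)
sigma2 = feature_variances(x)
print(mu)
print(sigma2)  # the constant (middle) feature has variance 0.0
```

The constant feature ends up with both mean and variance equal to zero, which is the situation the question is about.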

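The probability of each individual feature in each data set is calculated using the standard Gaussian formula:

$$p(x_j; \mu_j, \sigma_j^2) = \frac{1}{\sqrt{2\pi\sigma_j^2}} \exp\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right)$$

where \mu_j and \sigma_j^2 are the j-th entries of mu and sigma2.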

For a particular feature, if all values are zero, the mean (mu) is zero as well, and consequently sigma2 is zero. So when I calculate the probability through the Gaussian distribution, I get a "divide by zero" problem.
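The failure is easy to reproduce in a few lines. The sketch below is in Python rather than Octave, and `gaussian_pdf` is a name chosen here for illustration. In Python the zero variance surfaces as a `ZeroDivisionError`; Octave would typically yield Inf or NaN instead of raising an error.

```python
import math


def gaussian_pdf(x, mu, sigma2):
    """Univariate Gaussian density with mean mu and variance sigma2."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


# A feature that was always zero in training has mu = 0 and sigma2 = 0.
try:
    gaussian_pdf(0.0, 0.0, 0.0)
    failed = False
except ZeroDivisionError:
    # Both the exponent and the normalising constant divide by sigma2.
    failed = True

print(failed)
```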

However, in the test sets this feature's value can fluctuate, and I would term that an abnormality. How should this be handled? I don't want to ignore such a feature.

This problem occurs every time you have a variable that is constant. Approximating it with a normal distribution makes absolutely no sense: the whole information about such a variable is contained in a single value, and that is the intuition for why the division by zero occurs.

If you know that the variable can fluctuate in ways that were not observed in the training set, you can require the variance of that variable to be no less than some small value. Apply max(variance(x), eps) instead of the classic variance definition. Then you can be sure that no division by zero occurs.
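As a rough sketch of this idea in Python: the floor value `EPS = 1e-6` and the helper names are assumptions made for illustration, not values from the question, and a sensible floor depends on the scale of the feature.

```python
import math

EPS = 1e-6  # assumed variance floor; tune it to the scale of the feature


def floored_variance(values, eps=EPS):
    """Population variance of one feature, clamped from below at eps."""
    m = len(values)
    mu = sum(values) / m
    variance = sum((v - mu) ** 2 for v in values) / m
    return max(variance, eps)


def gaussian_pdf(x, mu, sigma2):
    """Univariate Gaussian density with mean mu and variance sigma2."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


train = [0.0, 0.0, 0.0, 0.0]      # feature that never changed in training
sigma2 = floored_variance(train)  # clamped to EPS instead of 0.0

p_same = gaussian_pdf(0.0, 0.0, sigma2)     # test value equal to the training value
p_changed = gaussian_pdf(1.0, 0.0, sigma2)  # test value that deviates from it

print(sigma2, p_same, p_changed)
```

With the floor in place, a test value that deviates from the constant training value gets a density close to zero, so it can still be flagged as abnormal, and the computation no longer divides by zero.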

