
Marginal Distributions Estimation and Analysis 

scott.m

[w,bins]=scott(c,normal,factor) 
Calculate the bin width and the bin vector by applying Scott's rule to vector 'c'.
normal = 1 : Use the normal rule
normal = 0 : Correct the normal rule with a skewness factor
factor : Multiply the calculated width by 'factor'.
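Scott's normal-reference rule sets the width to w = 3.49 * sigma * n^(-1/3). A minimal sketch in Python/NumPy (for illustration only; the edge placement and the skewness correction used by scott.m may differ):

```python
import numpy as np

def scott_bins(c, factor=1.0):
    """Bin width and bin edges for vector c via Scott's normal-reference rule.

    Scott's rule: w = 3.49 * sigma * n^(-1/3), optionally scaled by 'factor'
    (mirroring the 'factor' argument of scott.m)."""
    c = np.asarray(c, dtype=float)
    n = c.size
    w = 3.49 * c.std(ddof=1) * n ** (-1.0 / 3.0) * factor
    bins = np.arange(c.min(), c.max() + w, w)   # edges spanning the data
    return w, bins
```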

pdfplot.m 
[p,w,bins]=pdfplot(c,color,flag) 
Plot the PDF of the distribution 'c', using Scott's rule to choose the bin width.
If the 'flag' argument is given, the MATLAB 'plot' command is used instead of 'bar'.

pdfplotbins.m 
p=pdfplotbins(c,w,bins,color,flag) 
Plot the PDF of the distribution 'c' with the given bins ('w' and 'bins' are the bin width and the bin vector, respectively).
If the 'flag' argument is given, the MATLAB 'plot' command is used instead of 'bar'.

bivpdf.m 
bivpdf(seta,wa,setb,wb,bmin,bmax) 
Plot the bivariate PDF of variables 'seta' and 'setb' using the bin widths 'wa' and 'wb'.

ccdf.m 
ccdf(empirical,cutmin) 
Display the log-log complementary CDF and compute the alpha parameter starting from 'cutmin'.
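One common way to estimate the tail exponent alpha is a least-squares fit of the log-log empirical CCDF above 'cutmin'. A sketch in Python/NumPy (for illustration; ccdf.m may use a different estimator):

```python
import numpy as np

def ccdf_alpha(sample, cutmin):
    """Empirical CCDF plus a least-squares estimate of the tail exponent:
    for a power-law tail, log P(X > x) ~ -alpha * log x above cutmin."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    ccdf = 1.0 - np.arange(1, n + 1) / n          # P(X > x) at each sample
    keep = (x >= cutmin) & (ccdf > 0)             # fit region, drop CCDF = 0
    slope, _ = np.polyfit(np.log(x[keep]), np.log(ccdf[keep]), 1)
    return -slope, x, ccdf
```

Least-squares fitting of the CCDF is simple but biased in the extreme tail; maximum-likelihood (Hill-type) estimators are generally preferred for careful work.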


Entropy 

entropy_base.m 
entropy_base(p,base) 
Calculate the entropy of a single random variable given its PDF in the vector p and the base of the logarithm.
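The computation is the standard discrete entropy, H = -sum_i p_i log_b p_i, with 0*log 0 taken as 0. A sketch in Python/NumPy (for illustration; entropy_base.m may handle input checking differently):

```python
import numpy as np

def entropy_base(p, base=2):
    """Entropy of a discrete PDF p (probabilities summing to 1),
    in units set by the logarithm base (base 2 -> bits)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                                  # 0 * log 0 = 0 by convention
    return -np.sum(p * np.log(p)) / np.log(base)
```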

joint_ent.m 
joint_ent(Pxy,base) 
Calculate the joint entropy of a joint PDF "Pxy".

mutual_info.m 
mutual_info(Px,Py,Pxy,base) 
Calculate the mutual information between two random variables given their marginal PDFs 'Px' and 'Py' and joint PDF 'Pxy'.
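Mutual information is I(X;Y) = sum_xy Pxy log( Pxy / (Px Py) ), which equals H(X) + H(Y) - H(X,Y). A sketch mirroring the argument order of mutual_info.m (Python/NumPy for illustration):

```python
import numpy as np

def mutual_info(Px, Py, Pxy, base=2):
    """I(X;Y) = sum over (x,y) of Pxy * log(Pxy / (Px * Py)).
    Px, Py are marginal PDF vectors; Pxy is the joint PDF matrix."""
    Px, Py, Pxy = (np.asarray(a, dtype=float) for a in (Px, Py, Pxy))
    outer = np.outer(Px, Py)                      # product of marginals
    mask = Pxy > 0                                # 0 * log(...) = 0 terms
    return np.sum(Pxy[mask] * np.log(Pxy[mask] / outer[mask])) / np.log(base)
```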

conditional_ent.m 
conditional_ent(Px,Pxy,base) 
Calculate conditional entropy of a random variable.

sim_uncert.m 
sim_uncert(data,bins) 
Calculate the symmetrical uncertainty of the bivariate dataset 'data' using the number of bins specified by 'bins'.
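Symmetrical uncertainty is the normalized mutual information U(X,Y) = 2 I(X;Y) / (H(X) + H(Y)), ranging from 0 (independent) to 1 (fully dependent). A sketch computing it from a joint PDF matrix (Python/NumPy for illustration; sim_uncert.m instead starts from the raw data and a bin count):

```python
import numpy as np

def sym_uncertainty(Pxy):
    """U(X,Y) = 2 * I(X;Y) / (H(X) + H(Y)) from a joint PDF matrix."""
    Pxy = np.asarray(Pxy, dtype=float)
    Px, Py = Pxy.sum(axis=1), Pxy.sum(axis=0)     # marginals

    def H(p):                                     # entropy in bits
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    Hx, Hy = H(Px), H(Py)
    I = Hx + Hy - H(Pxy.ravel())                  # mutual information
    return 2.0 * I / (Hx + Hy)
```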

umargplot_off.m 
[newset,entr,um]=umargplot_off(empir,n,w) 
Perform a dataset reduction based on the offline marginal utility.

umarg.m 
[U]=umarg(Pr1, Pr2) 
Compute the marginal utility between two distributions.


Second Order Analysis 

acf_plot.m 
acf_plot(empirical,directory,variable) 
Plot the autocorrelation function of the data in 'empirical' and save the figure in EPS format.
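The quantity being plotted is the sample autocorrelation, r(k) = sum_t (x_t - m)(x_{t+k} - m) / sum_t (x_t - m)^2. A sketch of that computation, without the plotting and file saving (Python/NumPy for illustration):

```python
import numpy as np

def acf(x, maxlag):
    """Sample autocorrelation of series x at lags 0..maxlag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                              # center the series
    n = x.size
    denom = np.sum(x * x)                         # lag-0 sum of squares
    return np.array([np.sum(x[:n - k] * x[k:]) / denom
                     for k in range(maxlag + 1)])
```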


Statistical Fitting 

lambdasquare.m 
[x2,lambda2,dev] = lambdasquare(empirical, analytical, param, bins, mmin, mmax)

Estimate the lambda-square discrepancy between the empirical and analytical distributions.
param = parameters of the analytical distribution.
bins = custom bin vector, or a scalar bin width.
mmin,mmax:
Use 0,0 to apply no truncation.
Use 1,1 to truncate the analytical distribution to the min and max of the empirical one.
Use other values to truncate both distributions as desired.
Returns the Chi-square estimate (x2), the lambda-square estimate (lambda2), and its standard deviation (dev).

fitting.m 
fitting(emp_orig,empirical,w,b) 
Perform statistical fitting by comparing the given empirical distribution against several analytical
distributions by means of the lambda-square discrepancy measure.
It displays a comparison of the PDF of each analytical distribution against the empirical one.
It also outputs data formatted for insertion into a LaTeX table.

bestlambda.m 
bl = bestlambda(s) 
Simple script needed by "fitting.m". It compares several lambda-square discrepancy measures, taking their standard deviations into account.

workfit.m 
[lam,dev] = workfit(emp_orig, empirical, w, b, an, param, string); 
Simple script needed by "fitting.m". It plots the PDFs of an analytical distribution and of the empirical one in a single diagram.

normmix2.m 
[p, m, s, x, pdf, an] = normmix2(data, iter, flag) 
Perform statistical fitting using the Expectation Maximization algorithm with a mixture of 2 Gaussian distributions.
INPUT
data: data to be fitted
iter: max number of iterations (0 = default)
flag: OPTIONAL:
0 do nothing
1 plot a comparison of the PDFs
OUTPUT
p: mixture weights (percentages)
m: param1 (component means)
s: param2 (component standard deviations)
x: x values for the PDF
pdf: y values for the PDF
an: OPTIONAL: 100,000 random numbers drawn from the fitted mixture
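The EM iteration for a 2-component Gaussian mixture alternates an E-step (compute each component's responsibility for each point) with an M-step (re-estimate weights, means, and variances from those responsibilities). A compact sketch in Python/NumPy (for illustration; the initialization and stopping criterion of normmix2.m may differ):

```python
import numpy as np

def normmix2(data, iters=100):
    """EM fit of a 2-component Gaussian mixture: returns weights p,
    means m, and standard deviations s."""
    x = np.asarray(data, dtype=float)
    p = np.array([0.5, 0.5])
    m = np.percentile(x, [25, 75])                # spread the initial means
    s = np.array([x.std(), x.std()])
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        dens = p * np.exp(-0.5 * ((x[:, None] - m) / s) ** 2) \
                 / (s * np.sqrt(2 * np.pi))
        r = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        nk = r.sum(axis=0)
        p = nk / x.size
        m = (r * x[:, None]).sum(axis=0) / nk
        s = np.sqrt((r * (x[:, None] - m) ** 2).sum(axis=0) / nk)
    return p, m, s
```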

normmix3.m 
[p, m, s, x, pdf, an] = normmix3(data, iter, flag) 
Same as above, but with a mix of 3 Gaussian distributions.

normmix4.m 
[p, m, s, x, pdf, an] = normmix4(data, iter, flag) 
Same as above, but with a mix of 4 Gaussian distributions.

normmix5.m 
[p, m, s, x, pdf, an] = normmix5(data, iter, flag) 
Same as above, but with a mix of 5 Gaussian distributions.

tail.m 
tail(empirical,analytical) 
Estimate goodness of tail approximation.


Network Traffic Time Series 

bitrate.m 
[x,y] = bitrate(ipt, ps, k, steps, unit, time_unit) 
Calculate bit or packet rate from the interpacket time and packet size series.
INPUT
ipt: inter packet times
ps: vector of packet sizes (set to a single value to calculate the packet rate)
k: sampling period of the calculated rate
steps: how many samples to compute
unit:
set unit = 1 to have byte rate
set unit = 8 to have bit rate (unaffected if packet rate is requested)
time_unit:
if ipt is expressed in seconds, set to 1
if ipt is expressed in ms, set to 1000
etc.
Note that ps() has one more element than ipt()!
OUTPUT:
x: time ticks
y: corresponding rate for each tick
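The accumulation described by the INPUT/OUTPUT lists above can be sketched as follows (Python/NumPy for illustration; the exact binning conventions of bitrate.m may differ):

```python
import numpy as np

def bitrate(ipt, ps, k, steps, unit=8, time_unit=1):
    """Rate series from inter-packet times and packet sizes.
    Arrival times: first packet at t = 0, then cumulative inter-packet
    times (so ps has one more element than ipt, as noted above)."""
    t = np.concatenate([[0.0], np.cumsum(np.asarray(ipt, float) / time_unit)])
    ps = np.asarray(ps, dtype=float)
    y = np.zeros(steps)
    idx = (t // k).astype(int)            # bin index of each packet
    keep = idx < steps                    # drop packets beyond the horizon
    np.add.at(y, idx[keep], ps[keep])     # bytes accumulated per bin
    x = (np.arange(steps) + 1) * k        # time tick at the end of each bin
    return x, y * unit / k                # rate per bin (bits if unit = 8)
```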

bitrate2.m 
[x,y]=bitrate2(time,ps,scale,unit,time_type,time_unit); 
Calculate the bit or packet rate from the inter-packet time (or arrival-timestamp) and packet-size series.
Much faster than bitrate.m.
It can also work directly with arrival timestamps.


Network Traffic Modeling 

HMM.zip 
Hidden Markov Models library 
This set of scripts implements techniques based on Hidden Markov Models to model, replicate, and predict network traffic.
They have been used in our works, e.g.
"An HMM Approach to Internet Traffic Modeling"
