I have network traffic data: data volume (number of bytes) and number of streams over a one-week period for each pair of source and destination IP addresses. I want to plot the distribution, that is, frequency versus rank. I believe R already provides a function for this. What is it, and how do I use it in my script?
I found out that a Zipf plot is just a log-log plot of each entity's frequency (say, "flows") against its rank, with the entities sorted by frequency in descending order.
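As a rough illustration of that description, here is a minimal sketch in base R. The `flows` data frame, its column names (`pair`, `bytes`) and the simulated values are assumptions made up for the example; they are not the asker's actual data.

```r
# Hypothetical flow records: one row per observed flow (column names are assumptions).
set.seed(42)
pairs <- paste0("10.0.0.", 1:50, "->10.0.1.", 1:50)
flows <- data.frame(
  # Skewed sampling probabilities so that a few pairs dominate.
  pair  = sample(pairs, 5000, replace = TRUE, prob = 1 / seq_along(pairs)),
  bytes = rpois(5000, 1500)
)

# Number of flows per pair, sorted in descending order of frequency.
freq <- sort(table(flows$pair), decreasing = TRUE)
rank <- seq_along(freq)

# Total bytes per pair, sorted the same way, for a volume-versus-rank view.
volume <- sort(tapply(flows$bytes, flows$pair, sum), decreasing = TRUE)

# Log-log plot of frequency versus rank.
plot(rank, as.vector(freq), log = "xy",
     xlab = "Rank", ylab = "Number of flows")
```

Replacing `as.vector(freq)` with `as.vector(volume)` in the `plot()` call gives the same kind of plot for data volume instead of flow counts.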
There is also the zipfR package, which is available on CRAN.
You don't seem to need a special function:
x <- rpois(1000, 10)
tbl <- table(x)
plot(seq_along(tbl), unclass(tbl))
Or are you looking for hist?
hist(x)
This should really be a comment on hadley's answer, but what the original question is looking for is:
plot(log10(seq_along(tbl)), log10(unclass(tbl)))
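Note that `table()` orders its counts by the value of `x`, not by frequency, so a rank-based plot may want the counts sorted first. A minimal sketch of that variation, using the same kind of toy Poisson data as above (the seed and the fitted line are additions for illustration only; Poisson counts are not Zipf-distributed, so the slope here is not meaningful in itself):

```r
set.seed(1)
x <- rpois(1000, 10)                       # toy data, as in the earlier answer
tbl <- sort(table(x), decreasing = TRUE)   # order counts by frequency, not by value

rank <- seq_along(tbl)
freq <- as.vector(tbl)

plot(log10(rank), log10(freq),
     xlab = "log10(rank)", ylab = "log10(frequency)")

# Least-squares line on the log-log scale; under a power-law assumption,
# the negative of its slope is a crude estimate of the Zipf exponent.
fit <- lm(log10(freq) ~ log10(rank))
abline(fit, col = "red")
coef(fit)
```

On real traffic data, a roughly straight line in this plot would be consistent with Zipf-like behaviour.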
The tm (text mining) package has a function for drawing Zipf plots:
Zipf_plot(x, type = "l", ...)