The way calcSD gets its results is simple. There are many studies out there that try to determine the average dick size and how common each size is; all the website does is take data from these studies and process it. On this page you can read more about the studies and the calculations involved.

**The calculations use data from the dataset you select**, each one providing an average (a.k.a. mean) and a standard deviation for each dimension (length, girth, flaccid, etc.). More info about each one is available on the Dataset List page. There's also a small table under the main calculator for a quick reference on each dataset.

**The calculator's default dataset is the recommended one**, but you can choose others as well. Feel free to compare them. Outdated ones are marked as such, and using those is generally not recommended. Each study is based on only a small sample of the population, but they are reliable enough for general purposes. **As always, don't get too attached to the statistics.**

If you feel like the studies are wrong, or if you're skeptical about them, this page might help you.

The hosting service used does not allow any server-side processing, which means everything has to be done via JavaScript (for example, if an addition is needed to get a value, your browser does the addition while the page is open, rather than a server doing it and transmitting the result to you). This restricts what can be done with the website.

Using an average and a standard deviation, the website calculates a z-score (a.k.a. standard score), which represents how many standard deviations you are above or below the average. For example, if the average length is 13cm and the standard deviation is 1cm, then at 14cm your z-score would be +1, one standard deviation above the mean. At 16cm your z-score would be +3, at 10cm it would be -2, and so forth.

Afterwards, it converts the z-score into a percentile. It assumes a **normal distribution** (a.k.a. a bell curve), meaning it assumes the largest number of people are at the average and the quantity gradually decreases from there, with each side of the average decreasing at equal rates (a -3 z-score would be just as rare as a +3 z-score). **This is only an approximation**, meaning if you're at either extreme of the bell curve you're likely to have a difficult time comparing against the data, as there are fewer and fewer people to compare to.
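Since there's no server-side code, the z-score-to-percentile conversion has to happen in the browser. One standard way to do it, sketched below, is to approximate the normal CDF via the Abramowitz & Stegun error-function approximation; this is a common textbook method, not necessarily the exact code the site uses:

```javascript
// Standard normal CDF, approximated with the Abramowitz & Stegun
// erf formula 7.1.26 (maximum error ~1.5e-7).
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
               t * (-1.453152027 + t * 1.061405429))));
  const erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// A z-score of 0 sits exactly at the 50th percentile;
// +1 is roughly the 84th, -2 roughly the 2nd.
normalCdf(0) * 100; // ~50
normalCdf(1) * 100; // ~84.13
```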

After a percentile is obtained, it's not difficult to convert it to a number, which is how the values are compared against a room of *n* guys in the end. Fun fact: the 0th and 100th percentiles don't exist. Saying you're in the 99th percentile means you're higher than 99% of the population, but saying you're in the 100th percentile means you're higher than 100% of the population, including yourself...which is a contradiction. The same goes for the 0th percentile, but in reverse.
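The "room of *n* guys" comparison is just the percentile turned into odds. A hypothetical sketch (the rounding choice here is mine, not necessarily the site's):

```javascript
// Convert a percentile (between 0 and 100, exclusive) into
// "biggest out of roughly n people".
// e.g. the 99th percentile: about 1 in 100 people are larger.
function oneInN(percentile) {
  return Math.round(100 / (100 - percentile));
}

oneInN(99); // 100 -> largest in a room of ~100
oneInN(50); // 2   -> larger than every other guy, on average
```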

After all that, the only mystery left is the volume and how it's calculated. The volume for the measurements entered is calculated using the following formula:

Assuming a perfectly cylindrical shape (not the case in real life), the circumference/girth (C) is divided by two times π, then the result is squared. After that, it is multiplied by π again and then by the length.
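In other words, V = π·(C/2π)²·L, which simplifies to C²·L/(4π). A sketch, with made-up example measurements:

```javascript
// Cylindrical volume from circumference (girth) and length,
// both in the same unit (e.g. cm, giving cm^3 = ml).
// V = pi * r^2 * L, where r = C / (2 * pi)
function cylinderVolume(girth, length) {
  const radius = girth / (2 * Math.PI);
  return Math.PI * radius * radius * length;
}

// e.g. girth 12cm, length 13cm -> about 149 ml
cylinderVolume(12, 13);
```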

The statistics for volume, however, are a bit more complicated. To calculate them properly, I'd first need an average volume and a correlation value, which would show how correlated the length and girth measurements are, and then process them as a *multivariate/bivariate normal distribution*. So far, I haven't been able to find an average and a correlation value for volume in the studies, nor a way to calculate this specific type of distribution in JavaScript. Someone was able to do it using the R programming language, but that doesn't really work here.

Long story short, I had to get creative. I had an Excel file calculate for me, for each dataset:

- The length/girth value for each 0.1% increment, which generated 999 values each.
- The volume for each combination of length and girth in the increments generated before, making a 999x999 grid with a total of 998001 values.
- The average/mean of all those values, to make sure it was close to the volume of an average-sized member.
- The standard deviation of all those values.
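The spreadsheet procedure above can be reproduced in JavaScript. Here is a sketch under made-up dataset numbers (length 13.1 ± 1.7cm, girth 11.7 ± 1.1cm — assumptions for illustration, not any real dataset), with the inverse normal CDF obtained by bisection on an erf-based CDF approximation; this is not the actual Excel file or its exact numbers:

```javascript
// Standard normal CDF (Abramowitz & Stegun erf approximation 7.1.26).
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly = t * (0.254829592 + t * (-0.284496736 + t * (1.421413741 +
               t * (-1.453152027 + t * 1.061405429))));
  const erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Inverse CDF (quantile) by bisection -- slow but simple.
function normalQuantile(p, mean, sd) {
  let lo = -8, hi = 8;
  for (let i = 0; i < 60; i++) {
    const mid = (lo + hi) / 2;
    if (normalCdf(mid) < p) lo = mid; else hi = mid;
  }
  return mean + sd * (lo + hi) / 2;
}

const cylVolume = (girth, length) => girth * girth * length / (4 * Math.PI);

// Made-up example dataset: length 13.1 +/- 1.7cm, girth 11.7 +/- 1.1cm.
const L = { mean: 13.1, sd: 1.7 }, G = { mean: 11.7, sd: 1.1 };

// Step 1: the value at each 0.1% increment -> 999 values per dimension.
const lengths = [], girths = [];
for (let i = 1; i <= 999; i++) {
  lengths.push(normalQuantile(i / 1000, L.mean, L.sd));
  girths.push(normalQuantile(i / 1000, G.mean, G.sd));
}

// Step 2: the volume for every length/girth pair (999 x 999 = 998001 values).
// Steps 3-4: the mean and standard deviation of that grid.
let sum = 0, sumSq = 0;
for (const g of girths) for (const l of lengths) {
  const v = cylVolume(g, l);
  sum += v; sumSq += v * v;
}
const n = 999 * 999;
const gridMean = sum / n;
const gridSd = Math.sqrt(sumSq / n - gridMean * gridMean);

// Step 3's sanity check: compare against the volume of an average-sized member.
const pointVolume = cylVolume(G.mean, L.mean);
console.log(gridMean.toFixed(1), gridSd.toFixed(1), pointVolume.toFixed(1));
```

With these assumed numbers, the grid mean comes out slightly above the volume of the average-sized member, which is the kind of small discrepancy described next.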

Unfortunately, at step 3 I realized that a slight error margin appears when doing the calculations this way, **meaning there's an added error margin in the volume percentiles.** This error ranges from 1-5ml depending on the dataset. So, if the normal calculations are an estimate based on statistics, the volume calculations are an estimate of an estimate, which explains why they can end up inaccurate.

Just to give you an idea, that file (which I still have) occupies 17.5 MB and contains nothing but formulas and text. Curiously, the exact difference seems to be directly correlated with both the average/mean length and the girth's standard deviation, but not with the other two values.

If anyone knows a better way to calculate all this, please feel free to contact me.