Verifying the fraction
Background
In my last post we worked through a probability problem under heavy duress. Three points chosen uniformly at random from a unit disk - what's the probability their circumcircle fits inside?
After pages of Jacobians, angular integrals, and trig identities, I arrived at $P = \frac{2}{5}$.
You wonder if you actually got the right answer. It was a nice fraction, but did you overcount? Undercount? Did you mess up the integral?
Time to check. The beautiful thing about probability problems is that you can often verify analytical results empirically. Generate a million random instances, count successes, divide. Let's find out.
The plan
- Sample three points uniformly from the unit disk
- Compute their circumcircle (there's a closed-form expression)
- Check if it fits inside the unit disk
- Repeat a few million times
Simple enough. But there are a couple of subtleties worth walking through.
Sampling uniformly from a disk
Our first instinct might be: sample $x, y \sim \text{Uniform}(-1, 1)$ and reject any point with $x^2 + y^2 > 1$. That works, but it throws away about 21% of the draws.
The cleaner approach is polar coordinates. But here's where intuition can betray us. If we sample $r \sim \text{Uniform}(0, 1)$ and $\theta \sim \text{Uniform}(0, 2\pi)$, we do not get uniform density in the disk. We tend to oversample near the center.
Why? Think about concentric rings. A thin annulus at radius $r$ with thickness $dr$ has area $2\pi r \, dr$, which grows linearly with $r$. Sampling $r$ uniformly gives every annulus the same share of points regardless of its area, so the small rings near the center end up overrepresented.
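To see this concretely, here's a quick standalone sketch of my own (separate from the simulation later in the post): sample the radius uniformly and count how many points land within radius 0.5. That inner disk holds only 25% of the area, but the naive sampler puts about half of all points there.

```cpp
#include <iostream>
#include <random>

int main() {
  std::mt19937 rng(42);
  std::uniform_real_distribution<> dist(0.0, 1.0);
  const int N = 1'000'000;
  int inside = 0;
  for (int i = 0; i < N; ++i) {
    double r = dist(rng); // naive: radius uniform in [0, 1)
    // The angle doesn't matter for a radial check, so we skip it.
    if (r < 0.5) inside++;
  }
  // Prints ~0.50 even though the disk of radius 0.5 has only 25% of the area.
  std::cout << static_cast<double>(inside) / N << "\n";
}
```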
We want to sample points uniformly by area inside the unit disk. "Uniform by area" means: if we take any two regions with the same area, they should have the same probability of containing our sampled point.
Step 1: What's the probability of landing within radius $r$?
For uniform sampling, probability is proportional to area. The area inside radius $r$ is $\pi r^2$, and the total area of the unit disk is $\pi$.
So the probability of our point having radius at most $r$ is $\frac{\pi r^2}{\pi} = r^2$.
This is the CDF (cumulative distribution function) of the radius: $F(r) = r^2$ for $0 \le r \le 1$.
Step 2: The inverse transform method
Here's a general trick for sampling from any distribution. If we have:

- $F$, our desired CDF, which must be strictly monotonic
- $U \sim \text{Uniform}(0, 1)$

Then $X = F^{-1}(U)$ has CDF $F$.

To verify: we need $P(X \le x) = F(x)$. And indeed,

$$P(X \le x) = P(F^{-1}(U) \le x) = P(U \le F(x)) = F(x),$$

where the middle step uses the monotonicity of $F$ and the last step uses $P(U \le u) = u$ for a uniform variable.
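To see the trick on a different distribution (my own example, not part of the original derivation): the exponential distribution with rate $\lambda$ has CDF $F(x) = 1 - e^{-\lambda x}$, so $F^{-1}(u) = -\ln(1 - u)/\lambda$, and samples built this way should average out to $1/\lambda$.

```cpp
#include <cmath>
#include <iostream>
#include <random>

int main() {
  std::mt19937 rng(42);
  std::uniform_real_distribution<> dist(0.0, 1.0);
  const double lambda = 2.0;
  const int N = 1'000'000;
  double sum = 0.0;
  for (int i = 0; i < N; ++i) {
    // Inverse transform: F(x) = 1 - exp(-lambda x)  =>  x = -ln(1 - u) / lambda
    sum += -std::log(1.0 - dist(rng)) / lambda;
  }
  // Should print something close to 1 / lambda = 0.5.
  std::cout << "Sample mean: " << sum / N << "\n";
}
```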
Step 3: Apply to our case
Our CDF is $F(r) = r^2$, so the inverse is $F^{-1}(u) = \sqrt{u}$.
So if $U \sim \text{Uniform}(0, 1)$, then $r = \sqrt{U}$ has exactly the radial distribution we want.
Intuition check: When $U = 0.25$, we get $r = \sqrt{0.25} = 0.5$, meaning only 25% of points land within half the radius.
So 75% of points land in the outer half of the radius range - which makes sense because the outer annulus has more area!
So the recipe is: draw $U_1, U_2 \sim \text{Uniform}(0, 1)$ independently, set $r = \sqrt{U_1}$ and $\theta = 2\pi U_2$, and return the point $(r\cos\theta, r\sin\theta)$.
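Here's the recipe as a minimal sketch, together with the intuition check from a moment ago (the full simulation below wraps the same sampling in a sample_unit_disk helper):

```cpp
#include <cmath>
#include <iostream>
#include <random>

int main() {
  std::mt19937 rng(42);
  std::uniform_real_distribution<> dist(0.0, 1.0);
  const int N = 1'000'000;
  int inside = 0;
  for (int i = 0; i < N; ++i) {
    double r = std::sqrt(dist(rng));        // r = sqrt(U1)
    double theta = 2.0 * M_PI * dist(rng);  // theta = 2*pi*U2
    double x = r * std::cos(theta);
    double y = r * std::sin(theta);
    if (x * x + y * y < 0.25) inside++;     // within radius 0.5?
  }
  // Should print ~0.25: exactly the area fraction of the half-radius disk.
  std::cout << static_cast<double>(inside) / N << "\n";
}
```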
Computing the circumcircle
Given three points $p_1 = (x_1, y_1)$, $p_2 = (x_2, y_2)$, $p_3 = (x_3, y_3)$, the circumcenter $(c_x, c_y)$ is equidistant from all three:

$$(c_x - x_i)^2 + (c_y - y_i)^2 = r^2, \quad i = 1, 2, 3$$

Expanding the squares, every equation contains the same quadratic terms $c_x^2 + c_y^2$, so subtracting one equation from another leaves something linear.

If we expand the $i = 1$ equation and subtract the $i = 3$ equation:

$$-2x_1 c_x + x_1^2 - 2y_1 c_y + y_1^2 = -2x_3 c_x + x_3^2 - 2y_3 c_y + y_3^2$$

Rearranging:

$$2(x_1 - x_3)c_x + 2(y_1 - y_3)c_y = (x_1^2 + y_1^2) - (x_3^2 + y_3^2)$$

Similarly for $p_2$ and $p_3$:

$$2(x_2 - x_3)c_x + 2(y_2 - y_3)c_y = (x_2^2 + y_2^2) - (x_3^2 + y_3^2)$$

This is a $2 \times 2$ linear system in the unknowns $c_x$ and $c_y$.

The system becomes:

$$\begin{pmatrix} 2(x_1 - x_3) & 2(y_1 - y_3) \\ 2(x_2 - x_3) & 2(y_2 - y_3) \end{pmatrix} \begin{pmatrix} c_x \\ c_y \end{pmatrix} = \begin{pmatrix} (x_1^2 + y_1^2) - (x_3^2 + y_3^2) \\ (x_2^2 + y_2^2) - (x_3^2 + y_3^2) \end{pmatrix}$$

By Cramer's rule, we can write the solution down directly. If that name doesn't ring a bell, here's a refresher.
An aside on Cramer
Cramer's rule solves linear systems using determinants. For a $2 \times 2$ system

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} e \\ f \end{pmatrix}$$

The solution is:

$$x = \frac{\begin{vmatrix} e & b \\ f & d \end{vmatrix}}{\begin{vmatrix} a & b \\ c & d \end{vmatrix}} = \frac{ed - bf}{ad - bc}, \qquad y = \frac{\begin{vmatrix} a & e \\ c & f \end{vmatrix}}{\begin{vmatrix} a & b \\ c & d \end{vmatrix}} = \frac{af - ec}{ad - bc}$$
The pattern: to solve for a variable, replace its column in the coefficient matrix with the right-hand side vector, then divide that determinant by the original determinant.
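In code, the $2 \times 2$ case is a one-liner per variable. A minimal sketch (solve_2x2 is a hypothetical helper for illustration; the actual circumcircle code below inlines the same arithmetic):

```cpp
#include <cmath>
#include <iostream>
#include <optional>
#include <utility>

// Solve [a b; c d] [x; y] = [e; f] by Cramer's rule.
// Returns nothing when the determinant is (numerically) zero.
std::optional<std::pair<double, double>> solve_2x2(double a, double b,
                                                   double c, double d,
                                                   double e, double f) {
  double det = a * d - b * c;
  if (std::abs(det) < 1e-12) return std::nullopt;
  return std::make_pair((e * d - b * f) / det,   // x: (e, f) replaces column 1
                        (a * f - e * c) / det);  // y: (e, f) replaces column 2
}

int main() {
  // 2x + y = 5 and x + 3y = 10 have the solution x = 1, y = 3.
  if (auto s = solve_2x2(2, 1, 1, 3, 5, 10)) {
    std::cout << "x = " << s->first << ", y = " << s->second << "\n";
  }
}
```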
Applying to the circumcircle:
Our system (after the algebra) is:

$$2a_x c_x + 2a_y c_y = s_1$$
$$2b_x c_x + 2b_y c_y = s_2$$

where:

$a_x = x_1 - x_3$, $a_y = y_1 - y_3$, $b_x = x_2 - x_3$, $b_y = y_2 - y_3$

Let $s_1 = (x_1^2 + y_1^2) - (x_3^2 + y_3^2)$ and $s_2 = (x_2^2 + y_2^2) - (x_3^2 + y_3^2)$.

By Cramer's rule:

$$c_x = \frac{s_1 b_y - s_2 a_y}{2(a_x b_y - a_y b_x)}, \qquad c_y = \frac{a_x s_2 - b_x s_1}{2(a_x b_y - a_y b_x)}$$
In the code, I fold the factor of 2 into the denominator: $D = 2(a_x b_y - a_y b_x)$, so $c_x = (s_1 b_y - s_2 a_y)/D$ and $c_y = (a_x s_2 - b_x s_1)/D$.
When the three points are collinear (or nearly so), $D$ is zero (or nearly so) and there is no finite circumcircle. A 1e-12 tolerance catches these numerical edge cases rather than dividing by something tiny and getting nonsense.
Once we have the circumcenter, the circumradius is the distance from the center to any of the three points.
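A quick worked example with numbers of my own choosing: take $p_1 = (1, 0)$, $p_2 = (0, 1)$, $p_3 = (-1, 0)$. Then $a_x = 2$, $a_y = 0$, $b_x = 1$, $b_y = 1$, and $s_1 = s_2 = 0$ since all three points have $x_i^2 + y_i^2 = 1$. So $D = 2(2 \cdot 1 - 0 \cdot 1) = 4$, giving $c_x = c_y = 0$ and $r = 1$: the circumcircle is the unit circle itself, a boundary case that the containment check below accepts with equality.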
The containment check
The circumcircle fits inside the unit disk if and only if every point on the circumcircle is at distance at most 1 from the origin. The farthest point of the circumcircle from the origin lies along the ray from the origin through its center, at distance $\sqrt{c_x^2 + c_y^2} + r$.
So our condition is simply:

$$\sqrt{c_x^2 + c_y^2} + r \le 1$$
The code
Here's the full implementation. As far as I know, M_PI is a POSIX extension rather than standard C++, so make sure to define it yourself (or use _USE_MATH_DEFINES with MSVC) if you'd like to run this on Windows.
```cpp
#include <cmath>
#include <iomanip>
#include <iostream>
#include <random>
#include <string>

struct Point {
  double x, y;
};

struct Circle {
  double cx, cy, r;
  bool valid;
};

// Sample a point uniformly (by area) from the unit disk
Point sample_unit_disk(std::mt19937 &rng,
                       std::uniform_real_distribution<> &dist) {
  double u1 = dist(rng);
  double u2 = dist(rng);
  // r = sqrt(U) gives uniform area distribution
  double r = std::sqrt(u1);
  double theta = 2.0 * M_PI * u2;
  return {r * std::cos(theta), r * std::sin(theta)};
}

// Compute the circumcircle of three points via Cramer's rule
Circle circumcircle(const Point &p1, const Point &p2, const Point &p3) {
  double ax = p1.x - p3.x;
  double ay = p1.y - p3.y;
  double bx = p2.x - p3.x;
  double by = p2.y - p3.y;
  // Determinant with the factor of 2 folded in
  double D = 2.0 * (ax * by - ay * bx);
  // (Nearly) collinear points have no finite circumcircle
  if (std::abs(D) < 1e-12) {
    return {0, 0, 0, false};
  }
  double s1 = (p1.x * p1.x + p1.y * p1.y) - (p3.x * p3.x + p3.y * p3.y);
  double s2 = (p2.x * p2.x + p2.y * p2.y) - (p3.x * p3.x + p3.y * p3.y);
  double cx = (s1 * by - s2 * ay) / D;
  double cy = (ax * s2 - bx * s1) / D;
  // Circumradius: distance from the center to any of the three points
  double r = std::sqrt((cx - p1.x) * (cx - p1.x) + (cy - p1.y) * (cy - p1.y));
  return {cx, cy, r, true};
}

// The circle fits inside the unit disk iff |center| + r <= 1
bool circle_inside_unit_disk(const Circle &c) {
  double dist_to_origin = std::sqrt(c.cx * c.cx + c.cy * c.cy);
  return (dist_to_origin + c.r) <= 1.0;
}

int main(int argc, char *argv[]) {
  long long N = 1'000'000;
  if (argc > 1) {
    N = std::stoll(argv[1]);
  }

  std::random_device rd;
  std::mt19937 rng(rd());
  std::uniform_real_distribution<> dist(0.0, 1.0);

  long long successes = 0;
  long long collinear_skipped = 0;

  for (long long i = 0; i < N; ++i) {
    Point p1 = sample_unit_disk(rng, dist);
    Point p2 = sample_unit_disk(rng, dist);
    Point p3 = sample_unit_disk(rng, dist);
    Circle circ = circumcircle(p1, p2, p3);
    if (!circ.valid) {
      collinear_skipped++;
      continue;
    }
    if (circle_inside_unit_disk(circ)) {
      successes++;
    }
  }

  long long valid_trials = N - collinear_skipped;
  double probability = static_cast<double>(successes) / valid_trials;

  std::cout << std::fixed << std::setprecision(6);
  std::cout << "Trials: " << N << "\n";
  std::cout << "Successes: " << successes << "\n";
  std::cout << "Estimated: " << probability << "\n";
  std::cout << "Expected: " << 2.0 / 5.0 << "\n";
  return 0;
}
```
Compile with:
```
g++ -O3 -std=c++17 -o circumcircle circumcircle.cpp
```
Results
Running with 10 million trials:
```
Trials: 10000000
Successes: 3999797
Estimated: 0.399980
Expected: 0.400000
```
We get a relative error of about 0.005%. The analytical answer holds up. Different runs will give slightly different answers, of course, but they should hover around the 40% mark.
Convergence and confidence
Monte Carlo estimation is really just the law of large numbers in action. We're approximating an expectation (the true probability $p$) with a sample mean, which converges to $p$ as the number of trials grows.
The central limit theorem view
Let $X_i$ be the indicator of trial $i$: 1 if the circumcircle fits inside the disk, 0 otherwise. Each $X_i$ is Bernoulli with mean $p$ and variance $p(1 - p)$.

Our estimator is the sample mean:

$$\hat{p} = \frac{1}{N} \sum_{i=1}^{N} X_i$$

By the central limit theorem, for large $N$:

$$\hat{p} \approx \mathcal{N}\!\left(p, \frac{p(1 - p)}{N}\right)$$

The standard error (standard deviation of our estimator) is:

$$\text{SE} = \sqrt{\frac{p(1 - p)}{N}}$$

For $p = 0.4$ and $N = 10^7$:

$$\text{SE} = \sqrt{\frac{0.4 \times 0.6}{10^7}} \approx 0.000155$$
Interpreting our result
Our estimate was $\hat{p} = 0.399980$, which differs from $0.4$ by about $0.000020$.

In units of standard errors (often called "z-score" or "sigma"):

$$z = \frac{0.000020}{0.000155} \approx 0.13$$
A 0.13-sigma deviation is completely unremarkable. For reference:
- ~68% of runs will fall within 1 sigma ($\pm 0.000155$)
- ~95% will fall within 2 sigma ($\pm 0.00031$)
- ~99.7% will fall within 3 sigma ($\pm 0.000465$)
Our result landing 0.13 sigma from the truth is exactly what we'd expect from a working simulation.
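The arithmetic above is easy to mechanize. A small sketch of my own (assuming, as in the rest of this post, that the true $p$ is the conjectured $2/5$):

```cpp
#include <cmath>
#include <iostream>

int main() {
  const double p = 0.4;                  // conjectured true probability
  const long long N = 10'000'000;        // number of trials
  const long long successes = 3'999'797; // from the run above

  double p_hat = static_cast<double>(successes) / N;
  double se = std::sqrt(p * (1.0 - p) / N); // standard error
  double z = (p_hat - p) / se;              // deviation in sigmas

  std::cout << "p_hat = " << p_hat << "\n"
            << "SE    = " << se << "\n"
            << "z     = " << z << "\n"
            << "95% CI: [" << p_hat - 1.96 * se << ", "
            << p_hat + 1.96 * se << "]\n";
}
```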
The convergence rate
The standard error scales as $1/\sqrt{N}$: every extra decimal digit of accuracy costs a factor of 100 in samples.
| Samples | Standard Error | Accurate digits |
|---|---|---|
| $10^4$ | ~0.0049 | ~2 |
| $10^6$ | ~0.00049 | ~3 |
| $10^8$ | ~0.000049 | ~4 |
| $10^{10}$ | ~0.0000049 | ~5 |
To verify a sixth decimal place (a standard error around $2.5 \times 10^{-7}$), we'd need

$$N \approx \frac{0.4 \times 0.6}{(2.5 \times 10^{-7})^2} \approx 4 \times 10^{12}$$

Four trillion samples! At that point, trusting the algebra is the better strategy.
Conclusion
The hoodlum's problem has a verified answer: $P = \frac{2}{5}$.
Monte Carlo methods won't prove your closed-form solution is correct, but they're excellent at catching mistakes. If I'd botched the Jacobian or miscounted a factor somewhere, this simulation would have returned something noticeably different from 0.4.
It didn't. I can sleep easy.