<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="http://drspoulsen.github.io/feed.xml" rel="self" type="application/atom+xml" /><link href="http://drspoulsen.github.io/" rel="alternate" type="text/html" /><updated>2026-03-09T17:06:55+00:00</updated><id>http://drspoulsen.github.io/feed.xml</id><title type="html">drsp (Dylan R. S. Poulsen)</title><subtitle>Mathematician and Educator Seeking to Spread Mathematical Joy</subtitle><entry><title type="html">Accessible Coding Environments with Docker, Dev Containers, and GitHub Codespaces</title><link href="http://drspoulsen.github.io/Accessible-Coding-Environments/" rel="alternate" type="text/html" title="Accessible Coding Environments with Docker, Dev Containers, and GitHub Codespaces" /><published>2026-03-09T00:00:00+00:00</published><updated>2026-03-09T00:00:00+00:00</updated><id>http://drspoulsen.github.io/Accessible-Coding-Environments</id><content type="html" xml:base="http://drspoulsen.github.io/Accessible-Coding-Environments/"><![CDATA[<p><em>AI disclosure: This post is based off a voice memo by the author that was subsequently edited using Claude Opus 4.6 and manual editing.</em></p>

<p>The setup I describe here has become central to how I teach data science, and I want to share it so that others can put it into practice themselves. But first, I want to talk about why we, as educators who ask our students to code, should make the effort.</p>

<h2 id="the-ethical-case-for-accessible-coding-environments">The Ethical Case for Accessible Coding Environments</h2>

<p>Many of my students come into class with older technology. Some have loaner laptops that they have to keep trading out and therefore cannot install software on. Some only have a phone or an iPad. Some can only use library or lab computers to do their work. For these students, it is unreasonable and inaccessible to require them to install software in order to do their homework.</p>

<p>Data science education, homework, and code should be accessible first and foremost. This is not a nice-to-have; it is an ethical responsibility. If a student can run and see the results of their code, alter it, and test it – all without owning their own hardware or going through a complicated installation process – then we have removed a barrier that can otherwise demotivate and keep students from doing their work.</p>

<p>I have seen firsthand how a botched installation process can take the wind out of a student’s sails before they even write their first line of code. The focus should be on getting to work, not fussing with tooling. So, if you are convinced that this is a worthwhile pursuit, let’s look into the details of how to do this.</p>

<h2 id="the-setup-what-we-need">The Setup: What We Need</h2>

<p>I am teaching from Richard McElreath’s excellent <em>Statistical Rethinking</em> (second edition). Richard has an R package, <code class="language-plaintext highlighter-rouge">rethinking</code>, that runs on top of R and calls Stan in the background to perform Markov chain Monte Carlo calculations. My course notes are all written in Quarto. So, in order to have our environment ready to go, we need:</p>

<ol>
  <li>Base R</li>
  <li>Quarto</li>
  <li>Stan (compiled)</li>
  <li>The <code class="language-plaintext highlighter-rouge">rethinking</code> package</li>
  <li>VS Code extensions for R and Quarto</li>
</ol>

<p>This is a non-trivial installation process. But the good news is that we only have to do it once, and then every student benefits all semester.</p>

<h2 id="step-1-build-the-docker-image">Step 1: Build the Docker Image</h2>

<p>The idea is to create a Docker image that has everything installed and configured. A Dockerfile for an environment like this will typically start from a base image that includes R (such as <code class="language-plaintext highlighter-rouge">rocker/rstudio</code>), then layer on the installations of Quarto, CmdStan, and R packages. The details of compiling Stan are worth noting: CmdStan needs to be compiled in the image, and the <code class="language-plaintext highlighter-rouge">rethinking</code> package needs to be installed from GitHub since it is not on CRAN. This can take a while to build, but we only do it once.</p>
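<p>To make this concrete, here is a sketch of what such a Dockerfile might look like. This is an illustration, not the exact file I use for the course: the base image tag, the Quarto version, and the system package list are placeholders you would adapt to your own needs.</p>

```dockerfile
# Sketch of a course image -- versions and package lists are illustrative.
FROM rocker/rstudio:4.4.1

# System libraries needed to compile Stan models and build R packages
RUN apt-get update && apt-get install -y --no-install-recommends \
      build-essential curl libcurl4-openssl-dev libssl-dev libxml2-dev \
    && rm -rf /var/lib/apt/lists/*

# Quarto CLI (check quarto.org for the current release)
ARG QUARTO_VERSION=1.6.42
RUN curl -fsSL -o /tmp/quarto.deb \
      https://github.com/quarto-dev/quarto-cli/releases/download/v${QUARTO_VERSION}/quarto-${QUARTO_VERSION}-linux-amd64.deb \
    && dpkg -i /tmp/quarto.deb && rm /tmp/quarto.deb

# cmdstanr, a compiled CmdStan, and rethinking from GitHub (not on CRAN)
RUN R -e 'install.packages(c("remotes", "cmdstanr"), repos = c("https://stan-dev.r-universe.dev", getOption("repos")))' \
    && R -e 'cmdstanr::install_cmdstan()' \
    && R -e 'remotes::install_github("rmcelreath/rethinking")'
```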

<p>Once you have your Dockerfile ready, you build the image, tag it for the GitHub Container Registry, and push it:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>docker build <span class="nt">-t</span> mat-209-gp06 <span class="nb">.</span>
docker tag mat-209-gp06 ghcr.io/washington-college/mat-209-gp06:latest
docker push ghcr.io/washington-college/mat-209-gp06:latest
</code></pre></div></div>

<p>Before pushing, you need to authenticate with the container registry. You can create a Personal Access Token with the <code class="language-plaintext highlighter-rouge">write:packages</code> scope at <a href="https://github.com/settings/tokens">github.com/settings/tokens</a>, then log in:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">echo</span> <span class="nv">$YOUR_PAT</span> | docker login ghcr.io <span class="nt">-u</span> YOUR_GITHUB_USERNAME <span class="nt">--password-stdin</span>
</code></pre></div></div>

<p>Alternatively, if you build the image using a GitHub Actions workflow, the built-in <code class="language-plaintext highlighter-rouge">GITHUB_TOKEN</code> can handle authentication for you.</p>
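<p>For reference, a minimal workflow along these lines might look like the following sketch. The file path, trigger, and action versions are illustrative; the <code class="language-plaintext highlighter-rouge">permissions</code> block is what allows <code class="language-plaintext highlighter-rouge">GITHUB_TOKEN</code> to push to the registry.</p>

```yaml
# .github/workflows/build-image.yml -- illustrative sketch
name: Build and push course image
on:
  push:
    branches: [main]
permissions:
  contents: read
  packages: write   # lets GITHUB_TOKEN push to ghcr.io
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ghcr.io/washington-college/mat-209-gp06:latest
```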

<p>One more thing: by default, newly pushed packages are <strong>private</strong>. If you want Codespaces to be able to pull the image without additional authentication, go to the package settings on GitHub and change its visibility to public.</p>

<p>The image I built and pushed is hosted at:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ghcr.io/washington-college/mat-209-gp06:latest
</code></pre></div></div>

<h2 id="step-2-the-devcontainerjson-file">Step 2: The <code class="language-plaintext highlighter-rouge">devcontainer.json</code> File</h2>

<p>With the Docker image built and hosted, we need a <code class="language-plaintext highlighter-rouge">devcontainer.json</code> file that tells GitHub Codespaces how to set up the environment. This file lives in a <code class="language-plaintext highlighter-rouge">.devcontainer</code> directory in the repository. Here is the one I use:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"MAT 209 – GP 06 MCMC"</span><span class="p">,</span><span class="w">
  </span><span class="nl">"image"</span><span class="p">:</span><span class="w"> </span><span class="s2">"ghcr.io/washington-college/mat-209-gp06:latest"</span><span class="p">,</span><span class="w">

  </span><span class="nl">"remoteUser"</span><span class="p">:</span><span class="w"> </span><span class="s2">"rstudio"</span><span class="p">,</span><span class="w">

  </span><span class="nl">"customizations"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"vscode"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
      </span><span class="nl">"extensions"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
        </span><span class="s2">"REditorSupport.r"</span><span class="p">,</span><span class="w">
        </span><span class="s2">"REditorSupport.r-markdown"</span><span class="p">,</span><span class="w">
        </span><span class="s2">"quarto.quarto"</span><span class="w">
      </span><span class="p">],</span><span class="w">
      </span><span class="nl">"settings"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
        </span><span class="nl">"r.rterm.linux"</span><span class="p">:</span><span class="w"> </span><span class="s2">"/usr/local/bin/R"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"r.bracketedPaste"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span><span class="w">
        </span><span class="nl">"r.plot.useHttpgd"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span><span class="w">
        </span><span class="nl">"r.plot.defaults.plotWidth"</span><span class="p">:</span><span class="w"> </span><span class="mi">800</span><span class="p">,</span><span class="w">
        </span><span class="nl">"r.plot.defaults.plotHeight"</span><span class="p">:</span><span class="w"> </span><span class="mi">600</span><span class="w">
      </span><span class="p">}</span><span class="w">
    </span><span class="p">}</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Let’s walk through this. The <code class="language-plaintext highlighter-rouge">"image"</code> field points to the Docker image we built in Step 1. The <code class="language-plaintext highlighter-rouge">"remoteUser"</code> is set to <code class="language-plaintext highlighter-rouge">rstudio</code>, which is the default user in the rocker-based images. Under <code class="language-plaintext highlighter-rouge">"customizations"</code>, we install VS Code extensions for R, R Markdown, and Quarto so that students have syntax highlighting, code execution, and document rendering out of the box. The <code class="language-plaintext highlighter-rouge">"settings"</code> block configures R to use <code class="language-plaintext highlighter-rouge">httpgd</code> for plotting and sets sensible default plot dimensions.</p>

<p>That’s it. When a student opens this repository in a Codespace, the Docker image is pulled, the extensions are installed, and the environment is ready.</p>

<h2 id="step-3-prebuilds">Step 3: Prebuilds</h2>

<p>There is a catch. Pulling and initializing the Docker image can take upwards of 15 minutes, which is too long. In the spirit of our ethical responsibility to make things accessible, we can do better.</p>

<p>GitHub Codespaces supports <em>prebuilds</em>, which do the heavy computation of setting up the environment in advance. To set up a prebuild:</p>

<ol>
  <li>Go to the repository <strong>Settings</strong>.</li>
  <li>In the left sidebar, under <strong>Code and automation</strong>, click <strong>Codespaces</strong>.</li>
  <li>Click <strong>Set up prebuild</strong>.</li>
  <li>Select the branch and the dev container configuration.</li>
  <li>Restrict the region to where your students are located. All of my students access this from the US East region, so I restrict the prebuild to that region.</li>
  <li>Click <strong>Create</strong>.</li>
</ol>

<p>This kicks off a GitHub Action that takes about 15 minutes to run. Once it completes, when a student opens a Codespace, the environment is ready within three or so minutes. The heavy lifting has already been done.</p>

<h2 id="step-4-github-classroom">Step 4: GitHub Classroom</h2>

<p>With the dev container configured and the prebuild complete, the last step is to create the assignment in GitHub Classroom:</p>

<ol>
  <li>First, make your repository a <strong>template repository</strong>. Go to the repository <strong>Settings</strong>, and under <strong>General</strong>, check <strong>Template repository</strong>. This is required for GitHub Classroom to use it as starter code.</li>
  <li>Go to <strong>GitHub Classroom</strong> and create a new assignment.</li>
  <li>Select your template repository as the starter code.</li>
  <li>Choose <strong>GitHub Codespaces</strong> as the supported editor.</li>
  <li>I tend to make student repositories <strong>private</strong> to respect their privacy.</li>
  <li>I give students <strong>write access</strong> (not admin access) to their repos.</li>
</ol>

<p>That’s it. When a student accepts the assignment, they get their own copy of the repository, and clicking “Open in Codespace” gives them a fully configured environment with R, Quarto, Stan, and the <code class="language-plaintext highlighter-rouge">rethinking</code> package – ready to go.</p>

<h2 id="try-it-yourself">Try It Yourself</h2>

<p>If you want to see the end result from the student’s perspective, I have set up an assignment you can try: <a href="https://classroom.github.com/a/1vtNlby1">click here to accept the assignment</a>. When you click the link, you will be asked to select your name on a roster. Click <strong>Skip this step</strong>. Then, click the <strong>Open in GitHub Codespaces</strong> option. Within a few minutes, you should have a fully configured environment in your browser.</p>

<h2 id="it-can-be-simpler-than-this">It Can Be Simpler Than This</h2>

<p>I showed you a rather complicated case: compiling Stan and installing a package that is not on CRAN. For many projects, it will not be this complicated. If you are teaching a class where students need to code in Python, you can set up a virtual environment that automatically installs NumPy, Pandas, and Matplotlib. You could install the Microsoft Data Science tools for VS Code. The environment can be as simple or as involved as your course requires, but the process is the same.</p>
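<p>For a Python course, for example, the <code class="language-plaintext highlighter-rouge">devcontainer.json</code> can skip the custom image entirely and use a prebuilt dev container image. A minimal sketch (the image tag, package list, and extensions here are illustrative, not a tested course configuration):</p>

```json
{
  "name": "Intro Data Science (Python)",
  "image": "mcr.microsoft.com/devcontainers/python:3.12",
  "postCreateCommand": "pip install numpy pandas matplotlib",
  "customizations": {
    "vscode": {
      "extensions": [
        "ms-python.python",
        "ms-toolsai.jupyter"
      ]
    }
  }
}
```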

<h2 id="why-this-matters">Why This Matters</h2>

<p>By running this setup once, my students are set up all semester to benefit from easy access to a coding environment where they can explore, alter, and run code. As data science instructors, how many times do we tell students to change a parameter and see what happens? In statistics, we might generate synthetic data with a mean of 50, then change it to 75, and check whether the model tracks the change. The only way to build that intuition is to actually change the code and run it yourself.</p>

<p>My view is that we should not necessarily ask our students to write everything from scratch, but instead to alter the code they see, to change what is necessary to accomplish a goal, and to diagnose when the code is not working. An accessible coding environment makes all of this possible, regardless of what hardware a student owns.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[AI disclosure: This post is based off a voice memo by the author that was subsequently edited using Claude Opus 4.6 and manual editing.]]></summary></entry><entry><title type="html">Comparing, Contrasting, and Unifying the Discrete Time and Continuous Time Fourier Transforms</title><link href="http://drspoulsen.github.io/Fourier/" rel="alternate" type="text/html" title="Comparing, Contrasting, and Unifying the Discrete Time and Continuous Time Fourier Transforms" /><published>2023-01-06T00:00:00+00:00</published><updated>2023-01-06T00:00:00+00:00</updated><id>http://drspoulsen.github.io/Fourier</id><content type="html" xml:base="http://drspoulsen.github.io/Fourier/"><![CDATA[<p>In doing research on the Fourier Transform, I have had some interesting insights that I would like to share here.</p>

<h2 id="transform-definitions">Transform Definitions</h2>

<h3 id="classical-definitions">Classical Definitions</h3>

<p>The (classical) Continuous Time Fourier Transform (cCTFT) \({\cal X}_{\mathbb{R}}\) of a function \(f:\mathbb{R} \rightarrow \mathbb{R}\) is given by</p>

\[{\cal X}_{\mathbb{R}}\{f\}(\omega) := \int_{-\infty}^{\infty} f(t) e^{-i \omega t} \; dt.\]

<p>The (classical) Discrete Time Fourier Transform (cDTFT) \({\cal X}_{\mathbb{Z}}\) of a function \(f:\mathbb{Z} \rightarrow \mathbb{R}\) is usually given by</p>

\[{\cal X}_{\mathbb{Z}}\{f\}(\omega) := \sum_{t=-\infty}^{\infty} f(t) e^{-i \omega t}.\]

<h3 id="moving-towards-a-unified-definition">Moving Towards a Unified Definition</h3>

<p>The cDTFT definition feels and looks correct, since it looks exactly like the continuous time definition, but with the integral replaced by a sum. I would tend to agree, but I study time scales, a field that seeks to unify the continuous and the discrete under one analytic framework. One issue I have with this definition is that the domain of both transforms is the same: \(\omega\) can be any real number. In reality, the cCTFT and cDTFT can both be thought of as being evaluated on the cusp between the unstable region and the stable region of the Laplace Transform and the Z-Transform, respectively. This means the domain of the cCTFT is \(\mathbb{R}\), as we already have, but the domain of the cDTFT should really be the unit circle. Let’s make this more explicit by defining the cDTFT in terms of the variable \(e^{i \omega}\), which is always on the unit circle.</p>

\[{\cal X}_{\mathbb{Z}}\{f\}(e^{i \omega}) = \sum_{t=-\infty}^{\infty} f(t) \frac{1}{( e^{i \omega} )^t }.\]

<p>Now, the exponential function in the denominator of the summand above looks like the <a href="http://timescalewiki.org/index.php/Delta_exponential">delta exponential</a> on the integers, but without the shift of plus one. Define the new variable \(\xi = (e^{i \omega} -1)/i\). Notice that since \(e^{i \omega}\) is always on the unit circle, \(\xi\) is always on the unit circle shifted left by one, then rotated by \(-\pi/2\) radians (this is the same as saying \(\xi\) is on the unit circle shifted up by one unit). This leads to the same cDTFT in terms of the new variable \(\xi\),</p>

\[{\cal X}_{\mathbb{Z}}\{f\}(\xi) = \sum_{t=-\infty}^{\infty} f(t) \frac{1}{(1+i \xi)^{t}}.\]

<p>Notice nothing has really changed from the original cDTFT definition. I have simply recontextualized the definition and made one change of variable (notice \(1+ i \xi = e^{i \omega}\), so the denominator has remained the same).</p>
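<p>A one-line computation confirms that \(\xi\) traces the unit circle centered at \(i\):</p>

\[\xi - i = \frac{e^{i \omega}-1}{i} - i = -i e^{i \omega} + i - i = -i e^{i \omega}, \qquad \text{so} \quad \lvert \xi - i \rvert = 1.\]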

<p>Now, for reasons that will be revealed later, I actually want to alter the definitions of the cCTFT and cDTFT to arrive at the (unified) Continuous Time Fourier Transform (uCTFT) and the (unified) Discrete Time Fourier Transform (uDTFT).</p>

<h3 id="unified-continuous-time-fourier-transform-definition">Unified Continuous Time Fourier Transform Definition</h3>

<p>For the cCTFT, let \(\xi = \omega\) and define the uCTFT of \(f\), \({\cal F}_{\mathbb{R}}\), as</p>

\[{\cal F}_{\mathbb{R}}\{f\}(\xi) := {\cal X}_{\mathbb{R}}\{f\}(\xi).\]

<p>That is, the cCTFT and uCTFT are exactly the same, except I am renaming \(\omega\) as \(\xi\) (thrilling, I know).</p>

<h3 id="unified-discrete-time-fourier-transform-definition">Unified Discrete Time Fourier Transform Definition</h3>

<p>For the cDTFT, let \(\xi = (e^{i \omega} - 1)/i\) and define the uDTFT of \(f\), \({\cal F}_{\mathbb{Z}}\), as</p>

\[\begin{aligned}
{\cal F}_{\mathbb{Z}}\{f\}(\xi) &amp; := \frac{1}{1+i \xi} {\cal X}_{\mathbb{Z}}\{f\}(\xi) \\
                   &amp; = \sum_{t=-\infty}^{\infty} f(t) \frac{1}{(1+i \xi)^{t+1}}. 
\end{aligned}\]

<p>I have added in a forward shift to preserve operational properties.</p>

<h2 id="operational-properties">Operational Properties</h2>

<p>The differential operator of \(\mathbb{R}\) is the derivative. The differential operator I want to use on \(\mathbb{Z}\) is the forward difference \(\Delta f(t):= f(t+1)-f(t).\) There is a  key property that uCTFT and uDTFT share with respect to their respective differential operators.</p>

<h3 id="uctft-of-a-derivative">uCTFT of a Derivative</h3>

<p>Using integration by parts and the fact that acceptable signals must go to zero as time approaches infinity in either direction gives us</p>

\[\begin{aligned}
{\cal F}_{\mathbb{R}}\{f'\}(\xi) &amp; = \int_{-\infty}^{\infty} f'(t) e^{-i \xi t} \; dt \\
                                 &amp; = f(t) e^{-i \xi t} \Big\rvert_{-\infty}^{\infty} - \int_{-\infty}^{\infty} f(t) (-i \xi) e^{-i \xi t} \; dt \\
                                 &amp; = 0 + i \xi \int_{-\infty}^{\infty} f(t) e^{-i \xi t} \; dt \\
                                 &amp; = i \xi {\cal F}_{\mathbb{R}}\{f\}(\xi).
\end{aligned}\]

<h3 id="uctft-of-a-forward-difference">uDTFT of a Forward Difference</h3>

<p>Note that</p>

\[\begin{aligned}
\frac{1}{1+i \xi} ({\cal F}_{\mathbb{Z}}\{\Delta f\}(\xi) - i \xi {\cal F}_{\mathbb{Z}}\{f\}(\xi)) &amp;= \sum_{t=-\infty}^{\infty} \frac{(f(t+1) - f(t)) - i \xi f(t)}{(1+i \xi)^{t+2}} \\
&amp; = \sum_{t=-\infty}^{\infty} \frac{f(t+1)-(1+i \xi) f(t)}{(1+i \xi)^{t+2}} \\
&amp; = \sum_{t=-\infty}^{\infty} \left( \frac{f(t+1)}{(1+i \xi)^{t+2}} - \frac{f(t)}{(1+i \xi)^{t+1}} \right) \\
&amp; = \sum_{t=-\infty}^{\infty} \frac{f(t+1)}{(1+i \xi)^{t+2}} - \sum_{t=-\infty}^{\infty} \frac{f(t)}{(1+i \xi)^{t+1}} \\
&amp; = {\cal F}_{\mathbb{Z}}\{f\}(\xi) - {\cal F}_{\mathbb{Z}}\{f\}(\xi) \\
&amp; = 0.
\end{aligned}\]

<p>Thus \({\cal F}_{\mathbb{Z}}\{\Delta f\}(\xi) = i \xi {\cal F}_{\mathbb{Z}}\{f\}(\xi).\) This matches how the unified transform interacts with the differential operator on \(\mathbb{R}\).</p>

<p>Okay, I have to be honest here that we didn’t need the extra \((1+i \xi)\) in the denominator for this to work out on \(\mathbb{Z}\). But, the shift forward is absolutely essential when trying to make this work on arbitrary time domains.</p>
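<p>Here is a quick numerical sanity check of this operational property, using the shifted-kernel definition of the uDTFT. The signal values below are arbitrary; any finitely supported signal works, since the telescoping argument above involves only finite sums in that case.</p>

```python
# Numerically verify F_Z{Delta f}(xi) = i*xi * F_Z{f}(xi) for the
# shifted-kernel definition F_Z{f}(xi) = sum_t f(t) / (1 + i*xi)^(t+1).
import cmath

def udtft(f, xi):
    """Unified DTFT of a finitely supported signal given as a dict t -> f(t)."""
    return sum(v / (1 + 1j * xi) ** (t + 1) for t, v in f.items())

# An arbitrary signal supported on {-2, ..., 2}
f = {-2: 0.5, -1: -1.0, 0: 2.0, 1: 1.5, 2: -0.25}

# Forward difference Delta f(t) = f(t+1) - f(t), supported on {-3, ..., 2}
df = {t: f.get(t + 1, 0.0) - f.get(t, 0.0) for t in range(min(f) - 1, max(f) + 1)}

# A point xi on the unit circle shifted up by one: xi = (e^{i omega} - 1)/i
omega = 0.7
xi = (cmath.exp(1j * omega) - 1) / 1j

lhs = udtft(df, xi)
rhs = 1j * xi * udtft(f, xi)
print(abs(lhs - rhs) < 1e-12)  # True
```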

<h2 id="domains">Domains</h2>

<p>The uCTFT is defined on the real line. The uDTFT is defined on the unit circle shifted up by one, so it is tangent to the real axis and in the upper-half plane.</p>

<p>Let’s think about the time domain \(h \mathbb{Z} = \{\ldots, -3h, -2h, -h, 0, h, 2h, 3h, \ldots\}.\) For \(s \in h \mathbb{Z}\), one can perform the change of variables \(t = s/h\) and perform a uDTFT. However, looking at the domain of this transformation in the original frequency variable, we see the domain of the transform is the circle of radius \(1/h\) that is tangent to the real axis and in the upper-half plane. This lets us see a beautiful unity in our approach. As \(h \rightarrow 0\), the domain of the Fourier transform becomes a bigger and bigger circle, so big that the bottom part of the circle becomes almost a straight line – the real axis. This shows that the domain of the uDTFT becomes the domain of the uCTFT (the real line) in the limit as \(h \rightarrow 0\) (which means we’re making a better and better discrete-time approximation of continuous time).</p>
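<p>The claimed radius follows from a short computation. On \(h \mathbb{Z}\) the kernel involves powers of \(1 + i h \xi\) (compare the \(\frac{1}{v} \mathbb{Z}\) formula later with \(h = 1/v\)), and since \(1 + i h \xi = ih\left(\xi - \tfrac{i}{h}\right)\), keeping the kernel bounded requires</p>

\[1 = \lvert 1 + i h \xi \rvert = h \left\lvert \xi - \tfrac{i}{h} \right\rvert, \qquad \text{i.e.,} \quad \left\lvert \xi - \tfrac{i}{h} \right\rvert = \tfrac{1}{h},\]

<p>which is exactly the circle of radius \(1/h\) centered at \(i/h\), tangent to the real axis at the origin.</p>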

<video width="100%" controls="" autoplay="" loop="" muted="">
  <source src="/videos/circle.mp4" type="video/mp4" />
  Your browser does not support the video tag.
</video>

<h2 id="revisiting-the-domain">Revisiting the Domain</h2>

<p>Let’s think a little more deeply about the domain of these Fourier Transforms. In order for the Transform to be well-defined, the kernels must be bounded in time as \(t \rightarrow \pm \infty\) (the kernels being the functions that \(f(t)\) is multiplied by in the Fourier Transform). If they were not, then the integrals/sums in the Fourier Transforms would diverge.</p>

<h3 id="continuous-time-kernel">Continuous Time Kernel</h3>

<p>The kernel of the uCTFT is \(K(t,\xi) = e^{-i \xi t}\). Fix \(\xi \in \mathbb{C}\). For which values of \(\xi\) does \(K(t,\xi)\) remain bounded as \(t \rightarrow \pm \infty\)? Well, as long as \(\xi \in \mathbb{R}\), then \(\lvert e^{-i \xi t} \rvert = 1\). However, if \(\xi \not \in \mathbb{R}\), then the modulus of \(K\) will grow arbitrarily large either as \(t \rightarrow \infty\) or as \(t \rightarrow -\infty\). Why? Assume \(\xi = a + bi\), where \(b&gt;0\), for example. Then \(\lvert e^{-i \xi t} \rvert = \lvert e^{-i(a+bi)t} \rvert= \lvert e^{-ait}e^{bt} \rvert = \lvert e^{bt} \rvert \rightarrow \infty\) as \(t \rightarrow \infty\).</p>

<h3 id="discrete-time-kernel">Discrete Time Kernel</h3>

<p>The kernel of the uDTFT is \(K(t,\xi) = 1/(1+i \xi)^{t+1}\). Fix \(\xi \in \mathbb{C}\). For which values of \(\xi\) does \(K(t,\xi)\) remain bounded as \(t \rightarrow \pm \infty\)? Well, we need \(\lvert 1/(1+i \xi)^{t+1} \rvert = 1/\lvert 1+i \xi\rvert^{t+1}\) to be bounded. This requires the modulus of the base of the exponential to be one, so \(\lvert 1+i \xi \rvert=1\). Note that this is the equation for a circle of radius one centered at \(i\), which is exactly the unit circle shifted up by one that we have previously discussed.</p>

<h3 id="understanding-the-dft-in-this-new-light">Understanding the DFT in this new light.</h3>

<p>Pedagogically, I find it difficult to motivate the Discrete Fourier Transform (DFT). The DFT is obtained from the cDTFT by sampling \(\omega\) uniformly. To me, this feels like we are treating \(\omega\) as being on a line, and chopping it up into equally-sized pieces. But, we have seen the issue with thinking about \(\omega\) as being the variable in the cDTFT, in that it tends to cover up the fact that the domain is a circle and that the variable is really best thought of as \(e^{i \omega}\). I think our visualization of the domain can help.</p>

<p>For the cDTFT, the domain is a complete, continuous unit circle. For the DFT, the domain becomes points equally distributed about the circle, with \(e^{-i 0 t}=1\) as the anchor point.</p>
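<p>Concretely, for a signal supported on \(t = 0, 1, \ldots, N-1\), sampling the cDTFT at the \(N\) equally spaced frequencies \(\omega_k = 2 \pi k / N\) gives the familiar DFT:</p>

\[X[k] = {\cal X}_{\mathbb{Z}}\{f\}\left(e^{i 2 \pi k/N}\right) = \sum_{t=0}^{N-1} f(t)\, e^{-i 2 \pi k t/N}, \qquad k = 0, 1, \ldots, N-1.\]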

<p>If we label the zeroth point to be the anchor point, then proceed around the circle naming the first point, second point, third point, etc., then we recover the domain of the DFT, \(\{0,1,2,\ldots,N-1\}\). This is acceptable in practice because the DFT is a sequence, but it perhaps obscures what is going on and can lead to confusion.</p>

<p><img src="/images/Fourier/DFT.png" alt="17 points equally distributed around a circle" /></p>

<p>In our unified view, the domain of the uDTFT is the unit circle shifted up one. The unified Discrete Fourier Transform would also be points equally distributed about the circle, but with the <em>origin</em> as the anchor point. That feels better!</p>

<h2 id="the-inverse-fourier-transform">The Inverse Fourier Transform</h2>

<p>I have always found the inverse cDTFT to be puzzling, since it is an integral in \(\omega\):</p>

\[{\cal X}_{\mathbb{Z}}^{-1}\{F\}(t) = \frac{1}{2 \pi} \int_{-\pi}^{\pi} F(e^{i \omega}) e^{i \omega t} \; d \omega.\]

<p>The view that the domain of the cDTFT is actually just a circle helps this integral make more sense. Really, the integral is a contour integral around the unit circle. The reason this integral looks like a real integral is that the unit circle has been parameterized and this form is a result of that parameterization.</p>

<h2 id="the-nyquist-frequency">The Nyquist Frequency</h2>

<p>If the sampling rate is given by \(v\), the so-called Nyquist frequency is given by \(v/2\). If a signal has all its frequencies below the Nyquist frequency, then the signal can be perfectly reconstructed from the cDTFT. What does this mean in our scenario? If the sampling rate is \(v\) Hertz, then the time scale is \(\frac{1}{v} \mathbb{Z}\) (and it means that the angular velocity is \(2 \pi v\)). Therefore the region that the uDTFT is defined over is a circle of radius \(v\) centered at \(v i\). Remember, we have the view that frequency \(\omega\) is mapped to a point \(\xi\) on this domain (one could write \(\xi(\omega)\) to emphasize that \(\xi\) is a function of \(\omega\)).</p>

<p>The Fourier Transform on this time scale is</p>

\[{\cal F}_{\frac{1}{v} \mathbb{Z}}\{f\}(\xi)  = \sum_{t=-\infty}^{\infty} f(t) \frac{1/v}{(1+i \xi/v)^{t+1}}.\]

<p>The relationship between \(\xi\) and \(\omega\) on this time scale is</p>

\[\xi(\omega) = \frac{v}{i} (e^{i \omega/v} -1),\]

<p>which again, can be thought of as taking the unit circle, shifting it left by one, rotating it by \(-\pi/2\) radians, then scaling it up by a factor of \(v\).</p>

<p>Notice that this relationship between \(\xi\) and \(\omega\) is \(2 \pi v\) periodic. One would think this would mean that there would not be overlap and confusion unless the frequency exceeded \(2 \pi v.\) However, there are symmetries (time reversal and conjugation) which essentially induce this overlap early - when the frequency exceeds \(\pi v\).</p>

<p>What I mean by overlap and confusion is that all frequencies that map to the same value of \(\xi\) will contribute to the Fourier transform evaluated at \(\xi\). This is called <em>aliasing.</em></p>

<p>Perhaps what I like about this viewpoint is that there are two frequencies at work here. There is the sampling rate, which determines the size of the circle (the radius), and the variable frequency \(\omega\), which determines the angle from the center of the circle. So, the two frequencies act as the two components of a polar representation of the domain.</p>

<p>On a general time scale, there are many more frequencies at work (the time between samples is a function of time). But this separation is still at play. The sampling <em>frequencies</em> determine the shape of the domain (they won’t be circles anymore!) and the variable frequency \(\omega\) determines the location on the domain.</p>

<h2 id="two-time-steps">Two Time Steps</h2>

<p>Suppose that we have a signal that is sampled non-uniformly in time. For the sake of this example, let’s say the signal is sampled with a one-second gap, then a two-second gap, then a one-second gap, then a two-second gap, and so on. The sampling times form a time scale \(\mathbb{T}_{1,2} = \{\ldots, -7,-6,-4,-3,-1,0,1,3,4,6,7,\ldots\}.\) The unified Fourier Transform my colleagues and I have developed becomes, in this instance,</p>

\[\begin{aligned}
{\cal F}_{\mathbb{T}_{1,2}}\{f\}(\xi) &amp; = \ldots f(-7) (1+i \xi)^2 (1+2 i \xi)^2 + 2 f(-6) (1+i \xi)^2 (1+  2 i \xi) \\
&amp; + f(-4) (1+i \xi) (1+2 i \xi) + 2 f(-3) (1+i \xi) \\ 
&amp; + f(-1) + \frac{f(0)}{(1+ i \xi)} + 2 \frac{f(1)}{(1+i \xi)(1+2 i \xi)} \\
&amp; + \frac{f(3)}{(1+i \xi)^2 (1+2 i \xi)} + 2 \frac{f(4)}{(1+i \xi)^2 (1+2 i \xi)^2} + \ldots
\end{aligned}\]

<p>In order for the kernel to be bounded as \(t \rightarrow \pm \infty,\) it is necessary and sufficient that</p>

\[|(1+i \xi)(1 + 2 i \xi)| = 1.\]

<p>The set of \(\xi \in \mathbb{C}\) for which this is true is shown below. This is the domain of this Fourier Transform. Note that the region is still tangent to the real axis and entirely in the upper-half plane.</p>

<p><img src="/images/Fourier/T_12.png" alt="the domain for the 1,2 time scale. The domain looks almost like an ellipse (but it is not an ellipse)." /></p>

<p>This region is in some sense the average of the unit circle tangent to the real axis and the circle of radius 1/2 tangent to the real axis – i.e., the average of the domain of the Fourier Transform on \(\mathbb{Z}\) and the domain of the Fourier Transform on \(2 \mathbb{Z}\).</p>

<p>The mapping from frequency \(\omega\) to the point \(\xi\) in the complex plane is complicated, even in this basic case. While the domain looks elliptical, it is not an ellipse. This mapping is, however, periodic with period \(4 \pi/3.\) This is again in agreement with the Nyquist frequency, as it has been proven that a signal can be reconstructed if the average sampling rate satisfies the Nyquist criterion and the signal has a finite bandwidth. Here, the average is two samples in three seconds, so \(v = 2/3\), and the Nyquist frequency should be \(2 \pi/3\), which is exactly half the period yet again!</p>

<h2 id="antarctic-ice-sheet-example">Antarctic Ice Sheet Example</h2>

<p>Consider the <a href="https://cdiac.ess-dive.lbl.gov/ftp/trends/co2/vostok.icecore.co2">Vostok Ice Core CO2 Data</a>, which shows CO2 concentrations over the past 419,000 years, sampled non-uniformly in time via an ice core sample. This gives us a signal \(f(t)\), where \(f\) is CO2 concentration, and \(t\) is defined on the time scale \(\mathbb{T} = \{-413649, -411959, -410653, -407331, -405523, -405523,\ldots\}\), measured in years, with \(t=0\) being the age of the ice at the last data point, 5679 years ago.</p>

<p>The time between samples is 1142.68 years on average, so we expect the region where the Fourier Transform is defined to be much smaller than the regions we have encountered so far.</p>

<p><img src="/images/Fourier/vostok.png" alt="Region where the Fourier Transform is defined for the Vostok Ice Core CO2 Data. It is almost lemon-shaped." />
<em>Region where the Fourier Transform is defined for the Vostok Ice Core CO2 Data. It is almost lemon-shaped.</em></p>

<p>Again, we see that the sampling rate determines the domain of the Fourier Transform.</p>

<p>Below is the plot of the signal versus time in years.</p>

<p><img src="/images/Fourier/data1period.png" alt="CO2 versus time" /></p>

<p>Just as with the Discrete Fourier Transform, in order to work with this signal effectively, we need to assume the signal is periodic. Our work also assumes that the time domain is symmetric about the origin. But what is the time step between the last data point and the copy of the first data point when the signal repeats? We propose using the average time step in the signal, so as not to change the average sampling rate and hence the Nyquist frequency. Now the graph of the signal looks like the graph below, continuing on in either direction to infinity.</p>

<p><img src="/images/Fourier/data4periods.png" alt="CO2 versus time" /></p>

<p>We are not saying that we know the CO2 levels 800,000 years into the future; we are just augmenting the signal to allow the theory to analyze it.</p>

<p>The Nyquist frequency for this example is \(\pi/1142.68\) radians per year, which corresponds to a period of 2285.36 years, twice the average sampling interval. We should be able to detect patterns at this time scale and larger in the data.</p>
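<p>The Nyquist frequency here depends only on the average sample spacing. A minimal Python check (my own sketch, taking the average spacing of 1142.68 years quoted above as given):</p>

```python
import math

# Average time between ice-core samples, in years (value from the post).
avg_spacing = 1142.68

# Nyquist angular frequency for this average sampling rate.
nyquist_freq = math.pi / avg_spacing          # radians per year

# The corresponding period, 2*pi/omega, is twice the average spacing.
nyquist_period = 2 * math.pi / nyquist_freq   # years

print(nyquist_freq, nyquist_period)
```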

<h2 id="takeaways">Takeaways</h2>

<ul>
  <li>
    <p>When considering Fourier Transforms, there are two frequencies that play a role: the sampling rate and the target signal frequency (the variable \(\omega\) in the cCTFT).</p>
  </li>
  <li>
    <p>The sampling rate of a signal manifests in the shape of the domain of the Fourier Transform.</p>
  </li>
  <li>
    <p>The domain of the Fourier Transform is parameterized by the target signal frequency \(\omega.\) The Nyquist frequency manifests in the periodicity of this parameterization.</p>
  </li>
  <li>
    <p>The cCTFT and cDTFT are really two manifestations of the same process. A simple change of variables helps us to see how the cDTFT becomes the cCTFT as the sampling rate goes to infinity.</p>
  </li>
</ul>]]></content><author><name></name></author><summary type="html"><![CDATA[In doing research on the Fourier Transform, I have had some interesting insights that I would like to share here.]]></summary></entry><entry><title type="html">The Math of Shiny Hunting in Pokemon</title><link href="http://drspoulsen.github.io/Pokemon-Shiny-Hunting-Math/" rel="alternate" type="text/html" title="The Math of Shiny Hunting in Pokemon" /><published>2022-12-15T00:00:00+00:00</published><updated>2022-12-15T00:00:00+00:00</updated><id>http://drspoulsen.github.io/Pokemon-Shiny-Hunting-Math</id><content type="html" xml:base="http://drspoulsen.github.io/Pokemon-Shiny-Hunting-Math/"><![CDATA[<p>When I am not teaching or writing about mathematics, one of my hobbies is playing Pokemon games. In particular, I have been playing Pokemon Go for the past six years, and have recently started playing Pokemon Scarlet and Violet.</p>

<p>Pokemon Go has encouraged me to be active and to meet people both in my local community and around the world. I know people from my town that I wouldn’t otherwise know from the game. I have traveled to events and met people who I have befriended online. I find the games to be very relaxing, and a way to redirect my thoughts away from work.</p>

<p><img src="/images/pogoraidsfest.jpeg" alt="Group Photo from a meetup in Washington DC" />
<em>A Group Photo at a meetup in Washington DC for members of a Discord server I administrate. I am third from the left.</em></p>

<p><img src="/images/PoGoBaltimore.jpg" alt="Group Photo from a Pokemon Go Tournament" />
<em>A Group Photo at a Pokemon Go Tournament in Baltimore. I am in the middle in the yellow and blue shirt with the Ho-Oh on my shoulder. I know all these people from a Discord server that I administrate.</em></p>

<p>That said, I have a tendency to look for the math in everything (although math is NOT everywhere, as I tell my History of Mathematics students). I have joked with colleagues that I could teach a whole introductory mathematics course on the math of Pokemon. One subject in that course would be probability theory via shiny hunting.</p>

<p>A shiny Pokemon is a differently-colored version of a Pokemon that appears randomly and very rarely. Although they are purely cosmetic, they are highly sought-after trophies, with people dedicating hours, days, even months, trying to find a shiny Pokemon. So many Twitch and Youtube streams are dedicated to shiny hunting. So. Many.</p>

<p>Shiny Pokemon hunting is a wonderful lens through which to understand some of the most important concepts in probability theory: Independence, Bernoulli trials, the binomial distribution, the geometric distribution, the Poisson distribution, the exponential distribution, and the memoryless property. Let’s take each of these in turn.</p>

<h2 id="independence">Independence</h2>

<p>In Pokemon Scarlet and Violet, the probability that an individual Pokemon will be shiny is 1/4096. With certain methods in the game, this probability can be increased up to 1/512. Whether a given Pokemon is shiny or not does not depend on whether another Pokemon is shiny or not. In probability, we say that shininess is <em>independent</em> from one Pokemon to the next.</p>

<p>When calculating the probability that two <strong>independent</strong> things happen, one simply multiplies the probability that the first thing happens by the probability that the second thing happens. For example, the probability of 1) flipping a fair coin and getting a head <strong>and</strong> 2) rolling a five on a fair six-sided die is \((1/2) (1/6) = 1/12\).</p>

<h2 id="bernoulli-trial">Bernoulli Trial</h2>

<p>A Bernoulli trial is a fancy way of saying “flipping a weighted coin.” Technically, a Bernoulli trial is an experiment where “success” occurs with probability \(p\), and “failure” occurs with probability \(1-p\). There are no other options (here failure is an option).</p>

<p>Let’s recast this definition in terms of Pokemon. Checking whether one Pokemon is shiny (hereafter known as a <em>shiny check</em>) is a Bernoulli trial where “success” means the Pokemon is shiny, and “failure” means the Pokemon is not shiny. The value for \(p\) is \(1/4096\) (without any boosts).</p>

<h2 id="binomial-distribution">Binomial Distribution</h2>

<p>When hunting for shinies, people do not just check one Pokemon and call it a day. The name of the game is to check as many Pokemon as possible as quickly as possible. This means the Bernoulli trial is repeated, and each trial is independent. Let’s say a streamer was going to check 1000 Pokemon, then give up. They are curious to know the probability that they see zero shinies, one shiny, two shinies, and so on. The <em>binomial distribution</em> would sate their curiosity.</p>

<p>For \(p=1/4096\), the <em>probability mass function</em> of the binomial distribution is plotted below. On the \(x\)-axis is the number of shinies, and on the \(y\)-axis is the probability.</p>

<p><img src="/images/Binomial_4096.jpeg" alt="The Binomial Distribution for p equals 1 divided by 4096" /></p>

<p>The probability that the streamer fails the entire shiny hunt is almost \(80\%\), while the probability of getting one shiny is almost \(20\%\). There is a small probability of getting two (or more!) shinies.</p>

<p>Let’s break down one of these probabilities. If the streamer were to get one shiny, then they would have 999 failures and one success. The probability of this is \((1/4096)^{1} (4095/4096)^{999}\). But, this does not account for the order that the streamer finds a shiny. They could find the shiny on the first check, or the second check, or the third check, and so on until the 1000th check, which means there are 1000 different orders. The probability of getting exactly one shiny in 1000 checks is then</p>

<p>\(1000 (1/4096)^{1} (4095/4096)^{999} \approx 0.191...,\)
or, about 19.1%.</p>
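<p>These binomial probabilities are easy to reproduce directly. A small Python sketch (mine, not from the post) using only the standard library:</p>

```python
import math

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

p = 1 / 4096
print(binomial_pmf(0, 1000, p))  # ~0.783: the whole hunt fails
print(binomial_pmf(1, 1000, p))  # ~0.191: exactly one shiny
```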

<p>If the streamer were to use all the available methods (complete the Pokedex to get the shiny charm, go to a mass outbreak and clear sixty or more Pokemon, and make a level three sparkling power sandwich), and make \(p=1/512.44\) <sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> during the 1000 Pokemon hunt, the probability mass function for the binomial distribution would instead look like the graph below.</p>

<p><img src="/images/Binomial_512.jpeg" alt="The Binomial Distribution for p equals 1 divided by 512.44" /></p>

<p>The probability of failing the hunt (getting zero shinies) has decreased from nearly \(80\%\) to about \(14\%\). There is also a much higher probability of two or more shinies during the hunt. The extra effort to increase \(p\) seems to be worth it.</p>

<h3 id="side-note-about-the-151244-probability">Side Note About the 1/512.44 Probability</h3>

<p>The way the boosts in shiny rate work is that instead of doing one Bernoulli trial with \(p=1/4096\) to determine shininess, the game does more than one Bernoulli trial to determine shininess (with the number of trials determined by the boost). If any of these trials are successful, then the Pokemon is shiny. A shiny charm changes the number of Bernoulli trials to three, while the 1/512.44 rate comes from the number changing to eight trials. Why? Let’s work it out!</p>

<p>Let’s look at the binomial distribution with \(p=1/4096\) as before, but with \(8\) trials instead of \(1000\).</p>

<p><img src="/images/Binomial_4096_8.jpeg" alt="The Binomial Distribution for p equals 1 divided by 512.44" /></p>

<p>That looks like the probability of zero successes in eight trials is one, but this is why graphs can be misleading. Let’s do the math.</p>

<p>The probability of zero successes in eight independent trials is (by the multiplication rule) \((4095/4096)^8 \approx 0.99804954\). So the probability of at least one success (and therefore of a shiny Pokemon being produced) is about \(p_{\text{shiny}} = 1-.99804954 \approx 0.00195146\). Converting this to a fraction with one in the numerator gives \(1/512.4376602...\), which the Bulbapedia <sup id="fnref:1:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> rounds to \(1/512.44\).</p>
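<p>A quick check of this arithmetic in Python:</p>

```python
# How eight hidden 1/4096 rolls produce the quoted 1/512.44 boosted rate.
base = 1 / 4096
p_no_shiny = (1 - base) ** 8   # all eight rolls fail
p_shiny = 1 - p_no_shiny       # at least one roll succeeds
print(p_shiny, 1 / p_shiny)    # ~0.00195, ~512.44
```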

<h2 id="the-geometric-distribution">The Geometric Distribution</h2>

<p>Oftentimes, a shiny hunter will not be interested in multiple shinies of the same Pokemon. They will stop the hunt as soon as they find one shiny. Here, the shiny hunter is not interested in the number of shinies they get in a fixed number of checks, but instead in the number of checks until a shiny is found. The geometric distribution addresses this idea.</p>

<p>The geometric distribution gives, for each possible number of Bernoulli trials, the probability that the first success occurs on that trial. If a person is full-odds shiny hunting (\(p=1/4096\)), the geometric distribution for the number of checks until a shiny is found is shown below.</p>

<p><img src="/images/Geometric_4096.jpeg" alt="The Geometric Distribution for p equals 1 divided by 4096" /></p>

<p>First, notice how the \(x\)-axis is now “Number of Shiny Checks.” Also, the probability axis has very small numbers. This is because the probability (which adds to one) is spread out over many, many possibilities. In this context, it doesn’t make much sense to talk about the probability of it taking exactly, say, 4000 checks to get a shiny. Instead, it is more informative to look at the probability that it takes <strong>fewer than</strong>, say, 4000 checks to get the shiny. This idea is called the <em>cumulative distribution function.</em> For a geometric distribution with \(p=1/4096\), the cumulative distribution function has the graph plotted below.</p>

<p><img src="/images/Geometric_CDF_4096.jpeg" alt="The Geometric Distribution for p equals 1 divided by 4096" /></p>

<p>Some features I notice really quickly:</p>

<p>1) If a shiny hunter does 4096 checks, the shiny is not guaranteed. In fact, there is only about a 63% <sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> chance that the shiny will appear in the first 4096 checks. Even worse, as the number of checks gets large, the function levels off but never actually equals one. This means a shiny is never guaranteed.</p>

<p>2) It takes about 2838 checks in order for there to be a 50% chance of getting a shiny. I think most people would guess 2048 checks.</p>
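<p>Both observations follow from the geometric cumulative distribution function, \(1-(1-p)^n\). A quick Python check (a sketch, not code from the post):</p>

```python
import math

p = 1 / 4096

def geometric_cdf(n, p):
    """Probability that the first success occurs within the first n trials."""
    return 1 - (1 - p) ** n

print(geometric_cdf(4096, p))  # ~0.632: 4096 checks do not guarantee a shiny

# Smallest number of checks giving at least a 50% chance of a shiny.
n_half = math.ceil(math.log(0.5) / math.log(1 - p))
print(n_half)  # 2839
```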

<p>Some other features are less apparent. One of my favorite facts is that if someone has already done some checks, the cumulative distribution function for how many <strong>more</strong> checks until they encounter a shiny has exactly the graph above. This is called the <em>memoryless property.</em> This should feel both obvious and strange. I like to think about coin flipping. If I flip five heads in a row, but I know the coin is fair, the probability that the next flip is heads has not changed from \(1/2\). The past results have no impact on the future in this regard. So, if someone has already checked 10,000 Pokemon for shininess and has come up empty-handed, the probability that they’ll find a shiny in the next 4096 checks is still only 63%. There’s no credit earned from the universe for those first 10,000 checks.</p>
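<p>The memoryless property is easy to confirm numerically. In this sketch (my own, using the 10,000-check scenario above), the probability of no shiny in the next 4096 checks is the same whether or not 10,000 dry checks came first:</p>

```python
p = 1 / 4096

def prob_no_shiny(n, p):
    """Probability of n consecutive shiny checks with no success."""
    return (1 - p) ** n

# P(no shiny in the next 4096 checks | 10,000 dry checks so far)
conditional = prob_no_shiny(10_000 + 4096, p) / prob_no_shiny(10_000, p)
unconditional = prob_no_shiny(4096, p)
print(conditional, unconditional)  # equal: the past earns no credit
```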

<h2 id="the-poisson-distribution">The Poisson Distribution</h2>

<p>In Pokemon, the shiny checks, which are discrete events, still happen in continuous time. Instead of thinking about the number of shinies in a certain amount of checks, as in the Binomial Distribution, one could instead consider the number of shinies in a certain amount of time.</p>

<p>Let’s work with an example. Yesterday, while home sick, I set aside 30 minutes to shiny hunt Magnemite in a mass outbreak with \(p=1/512.44\). Using a technique known as “picnic resetting,” I was able to shiny check 15 Magnemite in, on average, 30 seconds (yes, I kept track). This means in 30 minutes, I would do approximately 900 checks. If I had hours and hours to play, I might expect that I would get \(900/512.44 \approx 1.756\) shinies per 30 minute period. But, due to randomness, the number I get in just one 30 minute period might be larger or smaller. The number of shinies I get in a 30 minute period with an average of 1.756 can be modeled by a <em>Poisson Distribution</em>, a distribution that models the number of rare events that occur in a unit of time.</p>

<p>The probability mass function for the Poisson distribution with an average value of 1.756 is shown below.</p>

<p><img src="/images/Poisson_512.jpeg" alt="The Poisson Distribution with a mean of 1.756" /></p>

<p>We can compare this to a Binomial distribution with \(p=1/512.44\) over 900 shiny checks.</p>

<p><img src="/images/Binomial_512_900.jpeg" alt="The Binomial Distribution with p equal to 1 divided by 512.44 and 900 shiny checks" /></p>

<p>They look very similar, and this is hopefully not a surprise (they are modeling the same thing, after all)! This is a wonderful illustration of the <em>Poisson Paradigm</em>, which states that the Poisson Distribution and the Binomial Distribution are very similar when the probability of success is small and the number of trials is large. One advantage of using the Poisson distribution is that the probabilities are easier to calculate. Another advantage is that we have shifted our thinking from discrete events that exist outside of time to thinking of these checks as existing in time.</p>
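<p>The Poisson Paradigm can be checked numerically. This sketch (my own, using the numbers from the post) compares the two probability mass functions for small shiny counts:</p>

```python
import math

def poisson_pmf(k, mu):
    """Probability of k events when the average count is mu."""
    return mu**k * math.exp(-mu) / math.factorial(k)

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

p, n = 1 / 512.44, 900
mu = n * p  # ~1.756 expected shinies per 30-minute session
for k in range(4):
    print(k, round(poisson_pmf(k, mu), 4), round(binomial_pmf(k, n, p), 4))
```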

<h2 id="the-exponential-distribution">The Exponential Distribution</h2>

<p>The geometric distribution models the number of shiny checks until the first success. Given the shift to continuous time in the Poisson Distribution, it makes sense to think about the amount of time until the first shiny appears. A big shift has occurred here, since time is continuous, not discrete like in the previous distributions. The distribution that models the time between rare events with a given average rate is the exponential distribution.</p>

<p>Continuing with the example in the previous section, I ask how long should it take to find a shiny Magnemite in a mass outbreak (remember that I only have 30 minutes)? I need an average, which I know is 1.756 per 30 minutes. But, if I want a plot where the \(x\)-axis is time in minutes, I should adjust this average to be 1.756/30 per minute. Below is the cumulative distribution function for an exponential distribution with average 1.756/30 versus time in minutes.</p>

<p><img src="/images/Exponential_512.jpeg" alt="The cumulative distribution function for an exponential distribution with average 1.756/30 versus time in minutes." /></p>

<p>According to this model, there is about an 82% chance that I will find the shiny in the thirty minutes I set aside for shiny hunting. If I give myself an hour, the probability increases to 97%. But, cruelly, if I don’t find one in the first 30 minutes, the probability that I find one in the next thirty minutes is only 82%, not 97%. Again, the universe doesn’t give credit for past effort in these regards. The memoryless property strikes again!</p>
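<p>Both of these probabilities come from the exponential cumulative distribution function, \(1-e^{-\lambda t}\). A small Python check (my own sketch, using the rate from the post):</p>

```python
import math

rate = 1.756 / 30  # expected shinies per minute

def exponential_cdf(t, rate):
    """Probability that the first shiny appears within t minutes."""
    return 1 - math.exp(-rate * t)

print(exponential_cdf(30, rate))  # ~0.83: a shiny within the first half hour
print(exponential_cdf(60, rate))  # ~0.97: a shiny within an hour
```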

<h2 id="the-memoryless-property">The Memoryless Property</h2>

<p>Both the Geometric Distribution and the Exponential Distribution have the memoryless property. In fact, they are the only two distributions that have this property: the geometric among discrete distributions, and the exponential among continuous ones.</p>

<h2 id="conclusion">Conclusion</h2>

<p>The Binomial Distribution and the Poisson Distribution are intricately linked, as are the Geometric Distribution and the Exponential Distribution. In fact, I wrote an academic paper that makes this link explicit, showing that they are the result of the same process on different time domains. <sup id="fnref:3" role="doc-noteref"><a href="#fn:3" class="footnote" rel="footnote">3</a></sup></p>

<p>Even the silly and recreational things in life can lead to interesting and deep ideas if pursued (that is perhaps the entire premise of this blog). That said, getting at deep and interesting ideas should not always be the goal in life. Sometimes, it’s just fun to kick back, relax, and look for differently colored digital monsters.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p><a href="https://bulbapedia.bulbagarden.net/wiki/Shiny_Pok%C3%A9mon">Bulbapedia Shiny Pokemon Page</a> <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a> <a href="#fnref:1:1" class="reversefootnote" role="doc-backlink">&#8617;<sup>2</sup></a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p>\(.63 \approx 1-\frac{1}{e}\), where \(e\) is Euler’s number, \(e \approx 2.71828...\) The fact that this shows up here is not a coincidence. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:3" role="doc-endnote">
      <p><a href="https://ieeexplore.ieee.org/document/5753775">The Poisson process and Associated Probability Distributions on Time Scales</a> <a href="#fnref:3" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name></name></author><summary type="html"><![CDATA[When I am not teaching or writing about mathematics, one of my hobbies is playing Pokemon games. In particular, I have been playing Pokemon Go for the past six years, and have recently started playing Pokemon Scarlet and Violet.]]></summary></entry><entry><title type="html">Sharing (is) a Piece of Cake</title><link href="http://drspoulsen.github.io/Cake-Sharing/" rel="alternate" type="text/html" title="Sharing (is) a Piece of Cake" /><published>2022-11-27T00:00:00+00:00</published><updated>2022-11-27T00:00:00+00:00</updated><id>http://drspoulsen.github.io/Cake-Sharing</id><content type="html" xml:base="http://drspoulsen.github.io/Cake-Sharing/"><![CDATA[<p>When my wife and I get takeout from a local Italian restaurant, we like to order one slice of cake to split for dessert. The slice of cake is very tall, but very skinny, so it comes in a container laying on its side. The slice is thin enough that we do not want to split the cake by turning it upright and creating two equal slices. Instead, we keep the cake on its side and cut it in a way that creates a wedge as one piece, and the rest of the cake as the other piece.</p>

<p><strong>Where should the slice be made to result in an even split?</strong></p>

<p><img src="/images/real_cake.jpg" alt="A Slice of Cake" /></p>

<h2 id="quick-remark">Quick Remark</h2>

<p>As long as people have had to share limited resources, fair sharing has been a topic of interest. For sharing food, a common technique is the “I cut, you choose” method, where one person makes a cut that they deem fair, and the second person chooses which portion to take. This ensures both parties are happy with the result.</p>

<p>This particular problem is potentially deceptive, since it is hard to estimate volumes of different shapes. The analysis to follow can at least set a good baseline for an “I cut, you choose” strategy. That said, it does not account for frosting distribution, amongst other factors.</p>

<h2 id="get-out-your-protractor-and-ruler-were-having-cake">Get Out Your Protractor and Ruler, We’re Having Cake!</h2>

<p>It’s time to discuss mathematics, so by convention I will switch to using “we” (as in, you and I, dear reader). First, we realize that the answer must depend on the angular size of the cake, since if the angle were extremely small, the place to make the cut would be nearly half the length of the slice.</p>

<p>So, we will model the slice of cake as a sector of a cylinder of radius one, with angle \(\theta\). To simplify things further, we recognize that the volume of the slice is the area of the top of the slice times the height of the cake, so we can just consider a sector of a circle of radius one with angle \(\theta\) in the “standard position” for a triangle.</p>

<p>Now, we will cut the cake perpendicular to the \(x\)-axis, at some number \(x=c, 0&lt;c&lt;1\). This creates a right triangle with base \(c\) and height \(c \tan(\theta),\) since the tangent of an angle is the ratio of the opposite to the adjacent side of a triangle.</p>

<p><img src="/images/cake_general.png" alt="Diagram for a general angle theta" /></p>

<p>The area of the wedge shape is \(A_{\text{wedge}} = \frac{1}{2} c^2 \tan(\theta).\) In order to share fairly, \(A_{\text{wedge}}\) must be half the area of the entire slice, \(A_{\text{slice}}=\frac{1}{2} \theta.\) That is</p>

\[\frac{1}{2} c^2 \tan(\theta) = \frac{1}{4} \theta.\]

<p>Solving for \(c\), we find \(c=\sqrt{\frac{1}{2} \theta \cot(\theta)}\) (ignoring the negative solution, since it doesn’t make sense in the context of the problem).</p>
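<p>The formula for \(c\) is simple enough to evaluate for any angle. A short Python helper (my own sketch, not from the post):</p>

```python
import math

def cut_position(theta):
    """Distance from the tip at which to cut a radius-1 sector of angle theta
    so the wedge and the remainder have equal area: sqrt(theta * cot(theta) / 2)."""
    return math.sqrt(0.5 * theta / math.tan(theta))

print(cut_position(math.pi / 4))  # ~0.6267
print(cut_position(math.pi / 6))  # ~0.6734
```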

<h2 id="some-examples">Some Examples</h2>

<h3 id="theta--pi4">\(\theta = \pi/4\)</h3>

<p>According to our house rules, a slice is legally defined to be one eighth of the whole. Therefore, for a slice of cake, the cut should be made at</p>

\[c=\sqrt{\frac{1}{2} \left( \frac{\pi}{4} \right) \cot \left(\frac{\pi}{4} \right)} = \sqrt{ \frac{\pi}{8} } \approx 0.6266570686.\]

<p><img src="/images/cake_pi_4.png" alt="Diagram for theta equals pi divided by 4" /></p>

<h3 id="theta--pi6">\(\theta = \pi/6\)</h3>

<p>Restaurants are not bound by house rules, so another likely definition of a slice is one twelfth of the whole. In this case, the cut should be made at</p>

\[c=\sqrt{\frac{1}{2} \left( \frac{\pi}{6} \right) \cot \left(\frac{\pi}{6} \right)}  \approx 0.6733868435.\]

<p>This is a really nice result, since for any realistic application it means the cake should be cut at 2/3 of its base length, which is easy to estimate.</p>

<p><img src="/images/cake_pi_6.png" alt="Diagram for theta equals pi divided by 6" /></p>

<h2 id="further-thoughts-on-sharing">Further thoughts on sharing</h2>

<p>Serious mathematics has been inspired by sharing cake. Steinhaus posed to his students the problem of whether there was a strategy like the “I cut, you choose” method for sharing a cake amongst \(n\) people. These students, Banach and Knaster (yes, the Banach of Banach spaces, Banach-Tarski paradox, Banach fixed-point theorem, etc) solved this problem <sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>, which will be the subject of a future blog post.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p><a href="https://www.jstor.org/stable/2974584">Martin L. Jones. “A Note on a Cake Cutting Algorithm of Banach and Knaster,” <em>The American Mathematical Monthly</em></a> <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name></name></author><summary type="html"><![CDATA[When my wife and I get takeout from a local Italian restaurant, we like to order one slice of cake to split for dessert. The slice of cake is very tall, but very skinny, so it comes in a container laying on its side. The slice is thin enough that we do not want to split the cake by turning it upright and creating two equal slices. Instead, we keep the cake on its side and cut it in a way that creates a wedge as one piece, and the rest of the cake as the other piece.]]></summary></entry><entry><title type="html">MARP for Mathematical Slides in Markdown</title><link href="http://drspoulsen.github.io/MARP/" rel="alternate" type="text/html" title="MARP for Mathematical Slides in Markdown" /><published>2022-11-21T00:00:00+00:00</published><updated>2022-11-21T00:00:00+00:00</updated><id>http://drspoulsen.github.io/MARP</id><content type="html" xml:base="http://drspoulsen.github.io/MARP/"><![CDATA[<p>I recently discovered <a href="http://marp.app/">MARP</a>, which allows the creation of slides from simple markdown. The slides can be exported as HTML, PDF, or Powerpoint once written in markdown.</p>

<p>I decided to write my slides for my colloquium talk about cutting an onion optimally (see also my blog post about this topic: <a href="https://drspoulsen.github.io/Onion/">Onion Blog Post</a>) with MARP to test it out. I am very happy with the results, especially being able to export the slides as HTML and embed a YouTube video within the slides.</p>

<iframe src="https://drspoulsen.github.io/Onion_Marp/index.html" title="Onion Talk in MARP"></iframe>

<p>If you want to check out the code for the slides, please see <a href="https://raw.githubusercontent.com/drspoulsen/Onion_Marp/main/Onion_Markdown.md">my GitHub repository</a></p>

<p>If you would like me to give a colloquium talk about this, please contact me via email or leave a reply on Mastodon.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I recently discovered MARP, which allows the creation of slides from simple markdown. The slides can be exported as HTML, PDF, or Powerpoint once written in markdown.]]></summary></entry><entry><title type="html">A solution to the Onion Problem of J. Kenji López-Alt</title><link href="http://drspoulsen.github.io/Onion/" rel="alternate" type="text/html" title="A solution to the Onion Problem of J. Kenji López-Alt" /><published>2021-11-13T00:00:00+00:00</published><updated>2021-11-13T00:00:00+00:00</updated><id>http://drspoulsen.github.io/Onion</id><content type="html" xml:base="http://drspoulsen.github.io/Onion/"><![CDATA[<p><strong>Note: This blog post originally appeared on <a href="https://medium.com/p/c3c4ab22e67c">my Medium blog</a>. I am reproducing the article here.</strong></p>

<p>I first became interested in the problem of cutting onions in a way that reduces the variance of the volumes of the slices at a gathering with friends. One of my friends and colleagues, Dr. Gabe Feinberg, also a mathematician, pointed me to the YouTube video below.</p>

<iframe width="560" height="315" src="https://www.youtube.com/embed/BMgLRD2v5w0?start=141" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>

<p>In the video, Chef Kenji López-Alt says he has a friend who is a mathematician, who claims that you should cut radially towards a point 60% of the radius below the center of the onion, and mentions that this might be related to the reciprocal of golden ratio, \(1/\phi = 0.61803398875...\)</p>

<p>I was intrigued by this, and even began cutting onions at home with this technique, just because it made me happy.</p>

<p><img src="/images/Onion/discord_onion.png" alt="A post in the Washington College Mathematics and Computer Science Department Discord where I show the results of cutting onions in this manner." /></p>

<p>Each time I cut an onion for dinner, my mind would wander. I would think about why this is true, and what techniques I could use to approach the problem. While this was meditative for me, these musings did not lead anywhere substantial over the span of two months.</p>

<p>Last weekend, my thoughts actually led me towards a solution. Within two days I had found the “true onion constant”, which, spoiler alert, is not the reciprocal of the golden ratio. The depth to which you have to aim your knife for radial cuts depends on the number of layers. You can see this by thinking of how to cut an onion with one layer versus an onion with ten layers to keep the pieces as similar as possible. For one layer you would aim towards the center of the onion, but for ten layers you would aim somewhere below the center of the onion. To simplify matters, I therefore thought of an onion with infinitely many layers (or, as Gabe called it, “the great onion in the sky,” which I love). These kinds of abstractions are common in mathematics, and make problems tractable. Once there are infinitely many layers, it makes sense to think of infinitely many cuts. This moves the problem into the realm of continuous mathematics, where calculus can be used to great effect.</p>

<p><img src="/images/Onion/discord_constant.png" alt="Discussion in a Discord Server. Dylan: I'm starting to envision a proof of this by taking the limit as the number of cuts goes to infinity, and looking at the infinitesimal volumes. Sort of like Jacobians. I'd really like to see if 1/phi comes out as the &quot;true&quot; best depth to cut towards. Gabe: Maybe there's some other onion constant" /></p>

<p><img src="/images/Onion/discord_onion_sky.png" alt="Discussion in a Discord Server. Dylan: The constant might also be a function of the number of layers. The true onion constant would then be in the limit as the number of layers goes to infinity... Gabe: Ah the great onion in the sky." /></p>

<p>Here’s the technical part of the post. You probably need to know multivariable calculus to follow from here. I’m going to switch to using “we” instead of “I” to match mathematical writing conventions, and to indicate that we (you, dear reader, and I) are walking down this mathematical path together.</p>

<p>First, we model the onion as half of a disc of radius one, with its center at the origin and existing entirely in the first two quadrants in a rectangular (Cartesian) coordinate system. This ignores a dimension, and perhaps also some geometry of actual onions (are cross sections actually circles?) but makes the problem tractable and is still a good approximation.</p>

<p>The insight that leads to a solution comes from the Jacobian. When we change from rectangular coordinates to polar coordinates in integration, small rectangular pieces of area \(dx dy\) are transformed into small pieces of area \(r dr d \theta\), where \(x = r \cos(\theta)\) and \(y = r \sin(\theta)\). The idea of the Jacobian applies to all changes in coordinate systems. We can calculate the Jacobian as</p>

\[J(r,\theta) = \frac{\partial x}{\partial r} \frac{\partial y}{\partial \theta} - \frac{\partial x}{\partial \theta} \frac{\partial y}{\partial r} = r \cos^2(\theta) + r \sin^2(\theta) = r.\]

<p>Below is a diagram showing the change of coordinates and the Jacobian in this setting.</p>

<p><img src="/images/Onion/Onion.png" alt="Diagram illustrating the Jacobian" /></p>
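<p>As a quick sanity check of my own (not part of the original derivation), we can confirm this determinant symbolically with sympy:</p>

```python
# Symbolic check of the polar-coordinate Jacobian determinant.
import sympy as sp

r, theta = sp.symbols("r theta", positive=True)
x = r * sp.cos(theta)
y = r * sp.sin(theta)

# J = dx/dr * dy/dtheta - dx/dtheta * dy/dr
J = sp.simplify(sp.diff(x, r) * sp.diff(y, theta)
                - sp.diff(x, theta) * sp.diff(y, r))
print(J)  # r
```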

<p>Notice that the coordinate system cuts the onion, much as the usual grid lines cut the plane into rectangles in the Cartesian coordinate system. The radial part of the coordinate system cuts the onion radially (which of course nature does by default, but we need to model this mathematically), while the angular part of the coordinate system cuts the onion as our knife would if we were making straight cuts towards the center of the onion. Even though every piece of the onion is infinitely small (there are infinitely many layers, and infinitely many cuts), the Jacobian \(r dr d\theta\) gives a measure of how big the infinitely small pieces are relative to each other. Pieces near the center of the onion are smaller than pieces near the edge, since \(r\) is smaller towards the center of the onion and larger towards the edge.</p>

<p>We can find the average value of the function \(f(r,\theta) = r\) over the part of the plane that defines the onion to find the average weight of the infinitesimal area, \(\overline{A}\).</p>

\[\overline{A} = \frac{\int_{0}^{\pi/2} \int_{0}^{1} r \; dr \; d \theta}{\int_{0}^{\pi/2} \int_{0}^{1} 1 \; dr \; d \theta} = \frac{1}{2}\]

<p>Once we have the average, we can find the variance, \(\sigma^2\), of the weight of the infinitesimal area by calculating</p>

\[\sigma^2 = \frac{\int_{0}^{\pi/2} \int_{0}^{1} (r-\overline{A})^2 \; dr \; d \theta}{\int_{0}^{\pi/2} \int_{0}^{1} 1 \; dr \; d \theta} = \frac{\int_{0}^{\pi/2} \int_{0}^{1} (r-1/2)^2 \; dr \; d \theta}{\int_{0}^{\pi/2} \int_{0}^{1} 1 \; dr \; d \theta}=\frac{1}{12}\]

<p>The variance is a good measure of the uniformity of the pieces. If the variance is large, the pieces are not very uniform, and vice-versa.</p>
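<p>These two integrals are simple enough to check numerically. Here is a quick sketch of my own using scipy (not part of the original computation):</p>

```python
# Numerical check: average and variance of the Jacobian weight r
# over the quarter-disc region 0 <= theta <= pi/2, 0 <= r <= 1.
import numpy as np
from scipy.integrate import dblquad

# dblquad integrates func(y, x) with x outer, y inner; here x = theta, y = r.
area = dblquad(lambda r, t: 1.0, 0, np.pi / 2, 0, 1)[0]                    # pi/2
mean = dblquad(lambda r, t: r, 0, np.pi / 2, 0, 1)[0] / area               # 1/2
var = dblquad(lambda r, t: (r - mean) ** 2, 0, np.pi / 2, 0, 1)[0] / area  # 1/12
print(mean, var)
```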

<p>The problem with this analysis, of course, is that we are cutting towards the center of the onion. We want to cut towards a point below the center of the onion. To accomplish this, we need a new coordinate system.</p>

<p>We make a coordinate system for cutting towards a point a distance \(h&gt;0\) below the center of the onion. In this coordinate system, we measure the angle \(\theta\) from the point \((0,-h)\), while we measure the radius from the origin \((0,0)\) (both points in the rectangular coordinate system). The radial part of the coordinate system cuts the onion radially from the origin as before, while the angular part of the coordinate system cuts the onion as our knife would if we were making straight cuts towards the point \((0,-h)\), below the onion.</p>

<p><img src="/images/Onion/Onion_4.png" alt="New Coordinate System" /></p>

<p>This coordinate system only works in the upper half plane, since a given coordinate pair \((r,\theta)\) would otherwise correspond to two points in the plane. Luckily, our onion lies entirely in the upper half plane!</p>

<p>In this coordinate system, the region of the plane that we model as the onion is defined by \(0 \leq \theta \leq \arctan(1/h)\), and \(h \tan(\theta) \leq r  \leq 1\). Notice that we are using symmetry. Usually we would think of the onion as a half-onion in the upper half of the plane. But, since the left side of the onion is a mirror image of the right side of the onion, and therefore both sides would have the same variance in area, we can perform this analysis just in the first quadrant.</p>

<p>The relation between \((x,y)\) and \((r,\theta)\) is less clear. Given \(r\), \(h\), and \(\theta\), we can draw the following triangle, with a new variable \(c\) representing the distance from the point \((0,-h)\) to a given point \((x,y)\) (both in the rectangular coordinate system).</p>

<p><img src="/images/Onion/Onion_7.png" alt="Illustrating c" /></p>

<p>First, using the law of cosines, we can calculate</p>

\[c=h \cos(\theta)+\sqrt{r^2-h^2\sin^2(\theta)}.\]
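<p>We can verify this expression for \(c\) symbolically (a check of my own, assuming the triangle above): substituting it back into the law of cosines \(r^2 = c^2 + h^2 - 2ch\cos(\theta)\) should leave no residual.</p>

```python
# Check that c = h*cos(theta) + sqrt(r^2 - h^2*sin^2(theta)) satisfies
# the law of cosines r^2 = c^2 + h^2 - 2*c*h*cos(theta).
import sympy as sp

r, h, theta = sp.symbols("r h theta", positive=True)
c = h * sp.cos(theta) + sp.sqrt(r ** 2 - h ** 2 * sp.sin(theta) ** 2)
residual = sp.simplify(c ** 2 + h ** 2 - 2 * c * h * sp.cos(theta) - r ** 2)
print(residual)  # 0
```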

<p>Using this, we can find the relationship between \((x,y)\) and \((r,\theta)\) as</p>

\[\begin{aligned}
x &amp; = c \sin(\theta) \\
y &amp; = c \cos(\theta)-h
\end{aligned}\]

<p>From this, for a given depth \(h\), we can calculate the Jacobian as</p>

\[\begin{aligned}
\scriptscriptstyle J(r,\theta) =&amp; \scriptscriptstyle \frac{r \cos (\theta ) \left(\sin (\theta ) \left(-\frac{h^2 \sin (\theta ) \cos (\theta )}{\sqrt{r^2-h^2 \sin ^2(\theta )}}-h \sin (\theta )\right)+\cos (\theta ) \left(\sqrt{r^2-h^2 \sin ^2(\theta )}+h \cos (\theta )\right)\right)}{\sqrt{r^2-h^2 \sin ^2(\theta )}}\\
&amp; \scriptscriptstyle -\frac{r \sin (\theta ) \left(\cos (\theta ) \left(-\frac{h^2 \sin (\theta ) \cos (\theta )}{\sqrt{r^2-h^2 \sin ^2(\theta )}}-h \sin (\theta )\right)-\sin (\theta ) \left(\sqrt{r^2-h^2 \sin ^2(\theta )}+h \cos (\theta )\right)\right)}{\sqrt{r^2-h^2 \sin ^2(\theta )}}.
\end{aligned}\]

<p>This is, to put it mildly, complicated. Nevertheless, we have done fairly straightforward calculus computations to get here, which shows the power of making this problem continuous. Mimicking what we did before, given a depth \(h\), we can find the average weight of the infinitesimal area, \(\overline{A}(h)\), by calculating the integral of the Jacobian over the onion region divided by the integral of 1 over the same region</p>
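<p>As a side check of my own (not in the original post), the messy expression above collapses, up to sign, to \(\frac{rc}{\sqrt{r^2-h^2\sin^2(\theta)}}\). A quick numerical spot check with sympy:</p>

```python
# Spot check: the new-coordinate Jacobian equals r*c/s up to sign,
# where s = sqrt(r^2 - h^2*sin^2(theta)) and c = h*cos(theta) + s.
import sympy as sp

r, theta, h = sp.symbols("r theta h", positive=True)
s = sp.sqrt(r ** 2 - h ** 2 * sp.sin(theta) ** 2)
c = h * sp.cos(theta) + s
x = c * sp.sin(theta)
y = c * sp.cos(theta) - h

J = (sp.diff(x, r) * sp.diff(y, theta)
     - sp.diff(x, theta) * sp.diff(y, r))

# Evaluate both sides at an arbitrary interior point of the onion region.
pt = {r: sp.Rational(7, 10), theta: sp.Rational(3, 10), h: sp.Rational(1, 2)}
err = abs(float(J.subs(pt)) + float((r * c / s).subs(pt)))  # J = -r*c/s with this ordering
print(err)
```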

\[\begin{aligned}
\overline{A}(h) &amp;= \frac{\int_{0}^{\arctan(1/h)} \int_{h \tan(\theta)}^{1} J(r,\theta) \; dr \; d\theta}{\int_{0}^{\arctan(1/h)} \int_{h \tan(\theta)}^{1} 1 \; dr \; d \theta}.
\end{aligned}\]

<p>And the variance of the weight of the infinitesimal area, \(\sigma^2(h)\), is found by calculating the integral of the squared difference between the Jacobian and \(\overline{A}(h)\) over the onion region divided by the integral of 1 over the same region</p>

\[\sigma^2(h) = \frac{\int_{0}^{\arctan(1/h)} \int_{h \tan(\theta)}^{1} (J(r,\theta)-\overline{A}(h))^2 \; dr \; d\theta}{\int_{0}^{\arctan(1/h)} \int_{h \tan(\theta)}^{1} 1 \; dr \; d \theta}.\]

<p>Yikes! Integrating this by hand looks really difficult, if not impossible. We should use a computer to help us. Using the power of numerical integration in Mathematica, we can plot the variance versus \(h\), the depth of the point we are cutting towards.</p>

<p><img src="/images/Onion/numerical.png" alt="Plot of the variance versus h" /></p>

<p>We can see the minimum variance is around \(h=0.55\). We can use a numerical minimization technique to find the \(h\) that minimizes the variance.</p>
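<p>For readers who prefer Python to Mathematica, here is a sketch of my own of the whole numerical pipeline with scipy (the region bounds and Jacobian are as derived above; the function names are mine):</p>

```python
# Minimize the variance of the infinitesimal piece sizes over the depth h.
import numpy as np
from scipy.integrate import dblquad
from scipy.optimize import minimize_scalar

def jacobian(r, theta, h):
    # |J| = r*c/s with s = sqrt(r^2 - h^2 sin^2(theta)), c = h cos(theta) + s
    s = np.sqrt(r ** 2 - h ** 2 * np.sin(theta) ** 2)
    return r * (h * np.cos(theta) + s) / s

def variance(h):
    # Onion region: 0 <= theta <= arctan(1/h), h*tan(theta) <= r <= 1
    t_max = np.arctan(1.0 / h)
    r_lo = lambda t: h * np.tan(t)
    area = dblquad(lambda r, t: 1.0, 0, t_max, r_lo, 1)[0]
    mean = dblquad(lambda r, t: jacobian(r, t, h), 0, t_max, r_lo, 1)[0] / area
    return dblquad(lambda r, t: (jacobian(r, t, h) - mean) ** 2,
                   0, t_max, r_lo, 1)[0] / area

h_opt = minimize_scalar(variance, bounds=(0.3, 0.8), method="bounded").x
print(h_opt)
```

<p>If the setup above is transcribed correctly, the minimizer should land near 0.5573.</p>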

<p>I am only confident in this number to 7 decimal places, but the “true onion constant” for the “onion in the sky” is given by 0.5573066…</p>

<p>To get the most even cuts of an onion by making radial cuts, one should aim towards a point 55.73066% of the radius of the onion below the center. This is close to, but different from, the 61.803% suggested in the YouTube video at the top. Also, this number will be different for onions with finitely many layers (that is to say, all onions). Nevertheless, I find this answer beautiful, and I will forever treasure the true onion constant.</p>

<p>I think it would be interesting to consider the effect of the number of layers on this answer. Since with one layer the best strategy is to cut towards the center, I suspect that the best depth \(h\) to cut towards increases with the number of layers, starting from zero for one layer, with 0.5573066… as the upper bound on the depth. So, the best depth for an onion with ten layers would be somewhere between 0 and 0.5573066. I have not investigated this in depth, but it seems like a fun next step.</p>

<p>I hope we all now know enough about onions to object.</p>

<p><img src="/images/Onion/exo.png" alt="Exo Comics 685" />
<em><a href="https://www.exocomics.com/685/">Exo Comics 685</a></em></p>

<p>Update: I was actually able to evaluate \(\sigma^2(h)\) in closed form. The techniques used to do it are really fun, and I am hoping to write them up for a recreational mathematics journal.</p>

<p>As calculus students know, if you want to minimize a function, you should take the derivative and set it equal to zero. Here, the derivative of \(\sigma^2(h)\) is given by</p>

\[[\sigma^2(h)]' = \frac{k(h)}{48 \left(\cot ^{-1}(h)-\frac{1}{2} h \log \left(\frac{1}{h^2}+1\right)\right)^3},\]

<p>where</p>

\[\begin{aligned}
  &amp;\scriptscriptstyle k(h)= \scriptscriptstyle -3 \pi ^2 \log \left(\frac{1}{h^2}+1\right) \\
    &amp; \scriptscriptstyle + 6 \left(h \log \left(\frac{1}{h^2}+1\right)-2 \cot ^{-1}(h)\right)^2 \left(h \left(h \log \left(4 h^2\right)+4 \sqrt{1-h^2} \left(\tan ^{-1}\left(\frac{h+1}{\sqrt{1-h^2}}\right)-\sin ^{-1}(h)\right)\right)+1\right)\\
    &amp; \scriptscriptstyle -2 \log \left(\frac{1}{h^2}+1\right) \left(h \log \left(\frac{1}{h^2}+1\right)-2 \cot ^{-1}(h)\right) \\
    &amp; \scriptscriptstyle \times \left(4 \left(1-h^2\right)^{3/2} \sin ^{-1}(h)-4 \left(1-h^2\right)^{3/2} \tan ^{-1}\left(\frac{h+1}{\sqrt{1-h^2}}\right)+h^3 \log \left(4 h^2\right)+h+2 \pi \right).
\end{aligned}\]

<p>The unique root of the above expression in the interval \((0,1)\) is the onion constant, since it is a critical point for the function \(\sigma^2(h)\) and the sign of \([\sigma^2(h)]'\) changes from negative to positive at this point, as seen in the graph of \([\sigma^2(h)]'\) below.</p>

<p><img src="/images/Onion/Sigma.png" alt="graph of derivative of variance versus h" /></p>

<p>With this, I can calculate the onion constant to arbitrary precision. Here it is to 1000 decimal places:</p>

<p>0.55730669298566447885109305914592718083200030207273275933982921319 4698135127210458697529556348892779238421515729764144366026144985585 4165046873271472618959107816152780606384065758548635804885244580180 0007394442805906736214054844087432881741438971785006588976790490992 3546045053996637979358236569783223477190862479127621607686248472908 3731336235000704236891376747519710815301807822317779086701048122723 0239150930543232987021503400654503271867566236420521560986469125085 8159370220537524022076834487502663198536347064463252552885622069125 8227307037720900190873707797080215945078389222941122441664099620992 6654693052663485088353188368234518499463417515539540122160704233743 5539919306999218795184234750992607153483541905867849402571200687099 2663407278202945110198402208378584410140122892631419360798953694134 2227610384234804380488890547391245831871629728678785899984149264095 1979084439023291773013425234306472822863355983488650721455375797473 6357343027167265972675903577598983959532796594227162648681839040…</p>

<p>Such a beautiful mathematical constant deserves a name. I choose to use the Hebrew character samekh, because it looks particularly like an onion.</p>

<p><img src="/images/Onion/onionpic2.png" alt="The Onion Constant Logo" /></p>]]></content><author><name></name></author><summary type="html"><![CDATA[Note: This blog post originally appeared on my Medium blog. I am reproducing the article here.]]></summary></entry></feed>