<p>Leo C. Stein (lcstein@olemiss.edu), Assistant Professor @ U of MS. Specializing in gravity and general relativity. <a href="https://duetosymmetry.com/">https://duetosymmetry.com/</a> (Jekyll feed, 2023-02-27T14:49:55+00:00)</p>
<p><a href="https://duetosymmetry.com/news/Nonlinear-paper-press">Nonlinear ringdown in the news</a> (2023-02-22T00:00:00+00:00)</p>
<p class="align-right" style="width: 400px"><img src="https://duetosymmetry.com/images/nonlinear-qnm-cartoon.png" alt="" /></p>
<p>Our latest paper, <a href="/pubs/Nonlinear-BH-ringdown/">Nonlinearities in black hole ringdowns</a>, was just <a href="https://doi.org/10.1103/PhysRevLett.130.081402">published in
Physical Review
Letters</a>!
This article was selected as an ❦ Editors’ Suggestion, and <a href="https://physics.aps.org/articles/v16/29">Featured in
APS’s Physics magazine</a>.
There were a number of other news stories covering this work:</p>
<ul>
<li>University of Mississippi: <a href="https://news.olemiss.edu/black-hole-researchers-make-progress-in-gravitational-wave-research/">Black Hole Researchers Make Progress in
Gravitational Wave
Research</a></li>
<li>Caltech: <a href="https://www.caltech.edu/about/news/physicists-create-new-model-of-ringing-black-holes">Physicists Create New Model of Ringing Black
Holes</a></li>
<li>Columbia University: <a href="https://news.columbia.edu/news/new-model-better-understand-whats-inside-colliding-black-holes">A New Model to Better Understand What’s Inside
Colliding Black
Holes</a></li>
<li>Johns Hopkins University: <a href="https://hub.jhu.edu/2023/02/22/hopkins-scientists-simulate-black-hole-collision/">Simulations show aftermath of black hole
collision</a></li>
<li>Keefe’s <a href="https://www.youtube.com/watch?v=4IJAf4UTwbA">60 Second Science: Keefe Mitman on Black Hole
Mergers</a></li>
</ul>
<p><a href="https://duetosymmetry.com/news/Sloan-fellowship">Sloan Research Fellowship</a> (2023-02-15T15:00:00+00:00)</p>
<p class="align-right" style="width: 250px"><img src="https://duetosymmetry.com/images/sloan-research-fellowships-2023-facebook.png" alt="" /></p>
<p>I am honored to have been selected as one of this year’s Sloan
Research Fellows! The Alfred P. Sloan Foundation awards these
fellowships annually (since 1955) through a highly competitive
application process. <a href="https://sloan.org/fellowships/2023-Fellows">This year, 126 early-career researchers were
named Sloan Fellows</a>. You
can see in Sloan’s <a href="https://sloan.org/fellows-database">fellows
database</a> the good company I’m in
(including undergrad friend <a href="https://chemistry.northwestern.edu/people/core-faculty/profiles/todd-gingrich.html">Todd
Gingrich</a>,
and grad school friend <a href="https://live-sas-physics.pantheon.sas.upenn.edu/people/standing-faculty/robyn-sanderson">Robyn
Sanderson</a>).
The Foundation writes (emphasis mine):</p>
<blockquote>
<p>[T]he Sloan Research Fellowships are one of the most competitive and
prestigious awards available to early-career researchers. They are
also often seen as a marker of the quality of an institution’s
science faculty and proof of an institution’s success in attracting
the most promising junior researchers to its ranks. <strong>This year
marks the first time a faculty member from University of Mississippi
has received a Sloan Research Fellowship</strong>—we want to extend our
congratulations and hope you’re as excited as we are!</p>
</blockquote>
<p>Thanks to the Sloan Foundation, my nominator, those who wrote letters
of support, and to all my colleagues who supported me over the years!</p>
<p><a href="https://duetosymmetry.com/pubs/Nonlinear-BH-ringdown">Nonlinearities in black hole ringdowns</a> (2022-08-17T00:00:00+00:00)</p>
<p>This article was selected as an ❦ Editors’ Suggestion, and <a href="https://physics.aps.org/articles/v16/29">Featured in
APS’s Physics magazine</a>.
<a href="/news/Nonlinear-paper-press/">More press coverage links here</a>.</p>
<p class="align-right" style="width: 400px"><img src="https://duetosymmetry.com/images/amp_vs_amp_both_sets_4panels.png" alt="" /></p>
<blockquote>
<p>The gravitational wave strain emitted by a perturbed black hole (BH)
ringing down is typically modeled analytically using first-order BH
perturbation theory. In this Letter, we show that second-order
effects are necessary for modeling ringdowns from BH merger
simulations. Focusing on the strain’s (ℓ,m)=(4,4) angular harmonic,
we show the presence of a quadratic effect across a range of binary
BH mass ratios that agrees with theoretical expectations. We find
that the quadratic (4, 4) mode’s amplitude exhibits quadratic
scaling with the fundamental (2, 2) mode—its parent mode. The
nonlinear mode’s amplitude is comparable to or even larger than that
of the linear (4, 4) mode. Therefore, correctly modeling the
ringdown of higher harmonics—improving mode mismatches by up to 2
orders of magnitude—requires the inclusion of nonlinear effects.</p>
</blockquote>
<p><a href="https://duetosymmetry.com/pubs/BMS-fixing-charges">Fixing the BMS Frame of Numerical Relativity Waveforms with BMS Charges</a> (2022-08-10T00:00:00+00:00)</p>
<p class="align-right" style="width: 350px"><img src="https://duetosymmetry.com/images/Comparing_EXT_CCE_errs.png" alt="" /></p>
<blockquote>
<p>The Bondi-van der Burg-Metzner-Sachs (BMS) group, which uniquely
describes the symmetries of asymptotic infinity and therefore of the
gravitational waves that propagate there, has become increasingly
important for accurate modeling of waveforms. In particular,
waveform models, such as post-Newtonian (PN) expressions, numerical
relativity (NR), and black hole perturbation theory, produce results
that are in different BMS frames. Consequently, to build a model for
the waveforms produced during the merging of compact objects, which
ideally would be a hybridization of PN, NR, and black hole
perturbation theory, one needs a fast and robust method for fixing
the BMS freedoms. In this work, we present the first means of fixing
the entire BMS freedom of NR waveforms to match the frame of either
PN waveforms or black hole perturbation theory. We achieve this by
finding the BMS transformations that change certain charges in a
prescribed way — e.g., finding the center-of-mass transformation
that maps the center-of-mass charge to a mean of zero. We find that
this new method is 20 times faster, and more correct when mapping to
the superrest frame, than previous methods that relied on
optimization algorithms. Furthermore, in the course of developing
this charge-based frame fixing method, we compute the PN expression
for the Moreschi supermomentum to 3PN order without spins and 2PN
order with spins. This Moreschi supermomentum is effectively
equivalent to the energy flux or the null memory contribution at
future null infinity ℐ⁺. From this PN calculation, we also compute
oscillatory (m≠0 modes) and spin-dependent memory terms that have
not been identified previously or have been missing from strain
expressions in the post-Newtonian literature.</p>
</blockquote>
<p><a href="https://duetosymmetry.com/pubs/bigravity-energy">Gravitational-wave energy and other fluxes in ghost-free bigravity</a> (2022-08-03T00:00:00+00:00)</p>
<p class="align-right" style="width: 350px"><img src="https://duetosymmetry.com/images/radiation-cylinder.png" alt="" /></p>
<blockquote>
<p>One of the key ingredients for making binary waveform predictions in
a beyond-GR theory of gravity is understanding the energy and
angular momentum carried by gravitational waves and any other
radiated fields. Identifying the appropriate energy functional is
unclear in Hassan-Rosen bigravity, a ghost-free theory with one
massive and one massless graviton. The difficulty arises from the
new degrees of freedom and length scales which are not present in
GR, rendering an Isaacson-style averaging calculation ambiguous. In
this article we compute the energy carried by gravitational waves in
bigravity starting from the action, using the canonical current
formalism. The canonical current agrees with other common energy
calculations in GR, and is unambiguous (modulo boundary terms),
making it a convenient choice for quantifying the energy of
gravitational waves in bigravity or any diffeomorphism-invariant
theories of gravity. This calculation opens the door for future
waveform modeling in bigravity to correctly include backreaction due
to emission of gravitational waves.</p>
</blockquote>
<p><a href="https://duetosymmetry.com/pubs/EMRI-tidal-resonance">Tidally-induced nonlinear resonances in EMRIs with an analogue model</a> (2022-03-18T00:00:00+00:00)</p>
<p class="align-right" style="width: 350px"><img src="https://duetosymmetry.com/images/poincare_4d.png" alt="" /></p>
<blockquote>
<p>One of the important classes of targets for the future space-based
gravitational wave observatory LISA is extreme mass ratio inspirals
(EMRIs), where long and accurate waveform modeling is necessary for
detection and characterization. When modeling the dynamics of an
EMRI, several effects need to be included, such as the modifications
caused by an external tidal field. The effects of such perturbations
will generally break integrability at resonance, and can produce
significant dephasing from an unperturbed system. In this paper, we
use a Newtonian analogue of a Kerr black hole to study the effect of
an external tidal field on the dynamics and the gravitational
waveform. We have developed a numerical framework that takes
advantage of the integrability of the background system to evolve it
with a symplectic splitting integrator, and compute approximate
gravitational waveforms to estimate the time scale over which the
perturbation affects the dynamics. We find that different entry
points into the resonance in phase-space can produce substantially
different dynamics. Finally, by comparing this time scale with the
inspiral time, we find tidal effects will need to be included when
modeling EMRI gravitational waves when <script type="math/tex">\varepsilon \gtrsim 300
q^2</script>, where <script type="math/tex">q</script> is the small mass ratio, and <script type="math/tex">\varepsilon</script>
measures the strength of the external tidal field.</p>
</blockquote>
<p><a href="https://duetosymmetry.com/pubs/high-precision-ringdown">High Precision Ringdown Modeling: Multimode Fits and BMS Frames</a> (2021-10-31T00:00:00+00:00)</p>
<p class="align-right" style="width: 350px"><img src="https://duetosymmetry.com/images/wrong-right-BMS-frame.png" alt="" /></p>
<blockquote>
<p>Quasi-normal mode (QNM) modeling is an invaluable tool for
characterizing remnant black holes, studying strong gravity, and
testing general relativity. Only recently have QNM studies begun to
focus on multimode fitting to numerical relativity strain waveforms.
As gravitational wave observatories become even more sensitive they
will be able to resolve higher-order modes. Consequently, multimode
QNM fits will be critically important, and in turn require a more
thorough treatment of the asymptotic frame at ℐ⁺. The first main
result of this work is a method for systematically fitting a QNM
model containing many modes to a numerical waveform produced using
Cauchy-characteristic extraction (CCE), a waveform extraction
technique which is known to resolve memory effects. We choose the
modes to model based on their power contribution to the residual
between numerical and model waveforms. We show that the all-mode
strain mismatch improves by a factor of ~10⁵ when using multimode
fitting as opposed to only fitting the (2, ±2,n) modes. Our most
significant result addresses a critical point that has been
overlooked in the QNM literature: the importance of matching the
Bondi-van der Burg-Metzner-Sachs (BMS) frame of the numerical
waveform to that of the QNM model. We show that by mapping the
numerical waveforms—which exhibit the memory effect—to a BMS frame
known as the super rest frame, there is an improvement of ~10⁵ in
the all-mode strain mismatch compared to using a strain waveform
whose BMS frame is not fixed. Furthermore, we find that by mapping
CCE waveforms to the super rest frame, we can obtain all-mode
mismatches that are, on average, a factor of ~4 better than using
the publicly-available extrapolated waveforms. We illustrate the
effectiveness of these modeling enhancements by applying them to
families of waveforms produced by numerical relativity and comparing
our results to previous QNM studies.</p>
</blockquote>
<p><a href="https://duetosymmetry.com/pubs/Action-angle-vars-PN">Action-angle variables of a binary black-hole with arbitrary eccentricity, spins, and masses at 1.5 post-Newtonian order</a> (2021-10-29T00:00:00+00:00)</p>
<p class="align-right" style="width: 350px"><img src="https://duetosymmetry.com/images/PN_loops.png" alt="" /></p>
<blockquote>
<p>Accurate and efficient modeling of the dynamics of binary black
holes (BBHs) is crucial to their detection through gravitational
waves (GWs), with LIGO/Virgo/KAGRA, and LISA in the future. Solving
the dynamics of a BBH system with arbitrary parameters without
simplifications (like orbit- or precession-averaging) in closed-form
is one of the most challenging problems for the GW community. One
potential approach is using canonical perturbation theory which
constructs perturbed action-angle variables from the unperturbed
ones of an integrable Hamiltonian system. Having action-angle
variables of the integrable 1.5 post-Newtonian (PN) BBH system is
therefore imperative. In this paper, we continue the work initiated
by two of us in
<a href="https://arxiv.org/abs/2012.06586">arXiv:2012.06586</a>, where we
presented four out of five actions of a BBH system with arbitrary
eccentricity, masses, and spins, at 1.5PN order. Here we compute the
remaining fifth action using a novel method of extending the phase
space by introducing unmeasurable phase space coordinates. We detail
how to compute all the frequencies, and sketch how to explicitly
transform to angle variables, which analytically solves the dynamics
at 1.5PN. This lays the groundwork to analytically solve the
conservative dynamics of the BBH system with arbitrary masses,
spins, and eccentricity, at higher PN order, by using canonical
perturbation theory.</p>
</blockquote>
<p><a href="https://duetosymmetry.com/notes/take-derivative-continued-fraction">Notes: How to take a derivative of a generalized continued fraction</a> (2021-08-14T06:00:00+00:00)</p>
<aside class="sidebar__right">
<nav class="toc">
<header><h4 class="nav__title"><i class="fa fa-file-text"></i> On This Page</h4></header>
<ul class="toc__menu" id="markdown-toc">
<li><a href="#simple-continued-fractions" id="markdown-toc-simple-continued-fractions">Simple continued fractions</a></li>
<li><a href="#generalized-continued-fractions" id="markdown-toc-generalized-continued-fractions">Generalized continued fractions</a></li>
<li><a href="#how-to-take-a-derivative-of-a-generalized-continued-fraction" id="markdown-toc-how-to-take-a-derivative-of-a-generalized-continued-fraction">How to take a derivative of a generalized continued fraction</a> <ul>
<li><a href="#automatic-differentiation-point-of-view" id="markdown-toc-automatic-differentiation-point-of-view">Automatic differentiation point of view</a></li>
</ul>
</li>
<li><a href="#modified-lentz-method" id="markdown-toc-modified-lentz-method">Modified Lentz method</a></li>
<li><a href="#derivatives-in-modified-lentz-method" id="markdown-toc-derivatives-in-modified-lentz-method">Derivatives in modified Lentz method</a></li>
<li><a href="#code-example" id="markdown-toc-code-example">Code example</a></li>
<li><a href="#references" id="markdown-toc-references">References</a></li>
</ul>
</nav>
</aside>
<p>Entire books have been written about generalized continued
fractions<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>⁻<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>, and there is a great review article on numerical
evaluation<sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>. Of course the <a href="https://en.wikipedia.org/wiki/Continued_fraction">article on
Wikipedia</a> is also
good. But I didn’t find an explanation of how to compute a derivative
of a generalized continued fraction in any of these, which is why I
wrote up these notes. Anyway, let’s start at the beginning.</p>
<h1 id="simple-continued-fractions">Simple continued fractions</h1>
<p>Before we get into generalized continued fractions, we should start
with their predecessors, just plain continued fractions.
Traditionally, a continued fraction was a way to represent any real
number – for example, <script type="math/tex">\phi = (1+\sqrt{5})/2</script> – by the <em>closest</em>
rational approximations. A simple algorithm generates a (possibly
infinite) sequence of <em>integers</em> <script type="math/tex">[b_0; b_1, b_2, b_3 \ldots]</script>:
take the integer part of the number, subtract it off, take the
reciprocal of the remainder, and repeat. The sequence <script type="math/tex">[b_0;
b_1, b_2, b_3 \ldots]</script> is our way of denoting</p>
<div>
\begin{align}
\label{eq:tradan}
x = b_0 + \frac{1}{b_1 + \frac{1}{b_2 + \frac{1}{b_3 + \ldots}}}
\,.
\end{align}
</div>
<p>Nesting all those fractions can make things look messy, so people
usually resort to a kind of hacky notation,</p>
<div>
\begin{align}
\label{eq:notation}
x = b_0 + \frac{1}{b_1 +} \frac{1}{b_2 +} \frac{1}{b_3 + }\ldots
\,.
\end{align}
</div>
<p>If this sequence terminates, then <script type="math/tex">x</script> is a rational number.
For a number like <script type="math/tex">\phi</script>, there is an obvious pattern,</p>
<div>
\begin{align}
\label{eq:phi}
\phi = [1; 1, 1, 1 \ldots] = 1 + \frac{1}{1 +} \frac{1}{1 +}
\frac{1}{1 + }\ldots
\,.
\end{align}
</div>
<p>In fact, a “quadratic irrational” will have a continued fraction with
a periodic sequence of integers! Other numbers have patterns without
repeating, while some just don’t have any pattern you can spot…</p>
<div>
\begin{align}
\label{eq:examples}
e &= [2;1,2,1,1,4,1,1,6,1,1,8 \ldots] \\
\pi &= [3; 7, 15, 1, 292, 1, 1, 1, 2, 1, \ldots]
\,.
\end{align}
</div>
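<p>As a minimal sketch of how such integer sequences can be generated (the
function name <code>simple_continued_fraction</code> is just an illustrative
choice, not something from a library): take the integer part, subtract it off,
and recurse on the reciprocal of the remainder. With floating-point input only
the first several terms can be trusted, since roundoff is amplified at every
reciprocal; exact rationals via <code>Fraction</code> terminate cleanly.</p>

```python
from fractions import Fraction
import math


def simple_continued_fraction(x, max_terms=10):
    """Return the leading terms [b_0, b_1, ...] of the simple continued fraction of x."""
    terms = []
    for _ in range(max_terms):
        b = math.floor(x)      # integer part
        terms.append(b)
        remainder = x - b      # fractional part, in [0, 1)
        if remainder == 0:     # expansion terminates: x is rational
            break
        x = 1 / remainder      # recurse on the reciprocal of the remainder
    return terms


print(simple_continued_fraction(Fraction(415, 93)))          # exact rational input
print(simple_continued_fraction((1 + math.sqrt(5)) / 2, 8))  # golden ratio (float)
print(simple_continued_fraction(math.pi, 5))                 # pi (float)
```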
<p>Obviously we’re not going to be taking derivatives of constants, so
let’s move on to generalized continued fractions.</p>
<h1 id="generalized-continued-fractions">Generalized continued fractions</h1>
<p>It’s not hard to imagine how to generalize the classical continued
fractions so that they depend on some parameter <script type="math/tex">x</script>. This kind of
thing actually comes up quite naturally in the theory of three term
recurrence relations, which is used for solving certain differential
equations – and lots of other places in math. The continued fraction
representation sometimes converges much more quickly than other ways
of computing a function. A generalized continued fraction looks like</p>
<div>
\begin{align}
\label{eq:general}
f(x) = b_0(x) + \frac{a_1(x)}{b_1(x) + }\frac{a_2(x)}{b_2(x) + }
\frac{a_3(x)}{b_3(x) + } \ldots \,.
\end{align}
</div>
<p>So, now we have two sequences, <script type="math/tex">a_i</script> and <script type="math/tex">b_i</script>, and they both
depend on some parameter <script type="math/tex">x</script> (or multiple parameters!). One example
is</p>
<div>
\begin{align}
\label{eq:arctan}
\text{arctan} x = 0 + \frac{x}{1+} \frac{x^2}{3+} \frac{(2x)^2}{5+}
\frac{(3x)^2}{7+}\ldots
\,,
\end{align}
</div>
<p>that is, it’s given by the two sequences</p>
<div>
\begin{align}
\label{eq:arctan_a_b}
a_1(x) = x, \ a_i(x) &= (i-1)^2 x^2 \quad (i \ge 2) \\
b_0(x) = 0, \ b_i(x) &= 2i-1 \quad (i \ge 1)
\,.
\end{align}
</div>
<p>If we throw away all the terms with <script type="math/tex">i>n</script>, then for each <script type="math/tex">n</script> we
get an “approximant” or “convergent”, and together these form a sequence. For example, let’s
say we use the continued fraction above to try to evaluate <script type="math/tex">\pi = 4
\text{arctan}(1)</script>, by evaluating at <script type="math/tex">x=1</script> and multiplying the final
result by 4. Then we would get the sequence</p>
<div>
\begin{align}
\label{eq:pi_conv}
\pi \approx 0, 4, 3, \frac{19}{6}, \frac{160}{51}, \frac{1744}{555},
\frac{644}{205}, \ldots
\end{align}
</div>
<p>The last of these is good to about 0.004% (note that this is not as
good as the <em>best</em> continued fraction for <script type="math/tex">\pi</script> with the same number
of terms, but that is a different question).</p>
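<p>The truncation described above can be sketched directly: drop everything
below level <em>n</em> and evaluate the nested fraction from the inside out.
The helper name <code>arctan_cf_truncated</code> is hypothetical, and exact
rational arithmetic is used here only so the fractions come out in closed
form.</p>

```python
from fractions import Fraction
import math


def arctan_cf_truncated(x, n):
    """Evaluate the arctan continued fraction keeping only terms with i <= n."""
    tail = 0  # everything below level n is thrown away
    for i in range(n, 0, -1):  # work from the innermost fraction outwards
        a_i = x if i == 1 else (i - 1) ** 2 * x ** 2
        b_i = 2 * i - 1
        tail = a_i / (b_i + tail)
    return tail  # b_0 = 0, so there is nothing else to add


# Convergents of pi = 4 arctan(1) for n = 0, ..., 6, as exact fractions.
convergents = [4 * arctan_cf_truncated(Fraction(1), n) for n in range(7)]
print(convergents)

# Relative error of the last convergent.
rel_err = abs(float(convergents[-1]) - math.pi) / math.pi
print(rel_err)
```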
<h1 id="how-to-take-a-derivative-of-a-generalized-continued-fraction">How to take a derivative of a generalized continued fraction</h1>
<p>Suppose we’re given a function <script type="math/tex">f(x)</script> that we <em>only</em> know in terms
of its continued fraction representation, and we want to compute its
derivative <script type="math/tex">f'(x)</script>. The first thing you might try (well, that I
tried) is to apply the quotient rule and chain rule on the expression
in Eq. \eqref{eq:general}. This leads to an explosion of algebra but
not an answer.</p>
<p>Instead of working with the notation of an infinite nested fraction,
we will instead think about the value of a continued fraction in terms
of its convergents. That is, when the CF converges, its value is the
limit of the convergents,</p>
<div>
\begin{align}
\label{eq:f_from_conv}
f = \lim_{n\to \infty} \frac{A_n}{B_n}
\,.
\end{align}
</div>
<p>Here you truncate the terms with <script type="math/tex">i>n</script>, then do the algebra to clear
out denominators and make an ordinary fraction <script type="math/tex">A_n/B_n</script>, with no
nested fractions. This sequence of ratios starts off</p>
<div>
\begin{align}
\label{eq:convergents}
f \approx b_0, \frac{b_1 b_0 + a_1}{b_1}, \frac{b_2 (a_1+b_1 b_0) +
a_2 b_0}{b_2 b_1 + a_2},
\ldots
\end{align}
</div>
<p>All the way back in 1655/6 in his text <a href="https://archive.org/details/ArithmeticaInfinitorum"><em>Arithmetica
Infinitorum</em></a>,
John Wallis showed that the <script type="math/tex">A_n</script> and <script type="math/tex">B_n</script> satisfy a recurrence
(which you can prove by induction),</p>
<div>
\begin{align}
\label{eq:A_recurrence}
A_{n+1} &= b_{n+1} A_n + a_{n+1} A_{n-1} \,, \\
\label{eq:B_recurrence}
B_{n+1} &= b_{n+1} B_n + a_{n+1} B_{n-1} \,,
\end{align}
</div>
<p>which starts off with a fake “-1” term and the zeroth term,</p>
<div>
\begin{align}
\label{eq:AB_init}
A_{-1} = 1, \ B_{-1} = 0, \ A_0 = b_0, \ B_0 = 1 \,.
\end{align}
</div>
<p>As an aside, the <script type="math/tex">A</script>’s and <script type="math/tex">B</script>’s may grow exponentially<sup id="fnref:4"><a href="#fn:4" class="footnote">4</a></sup> and
lead to a loss of precision on the computer. To avoid this, there are
various improvements to Wallis’s original algorithm, one of which we
will <a href="#modified-lentz-method">discuss below</a>.</p>
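<p>Here is a short sketch of Wallis’s recurrence as written above, using
Python integers so that nothing overflows or loses precision. The function
name <code>wallis_convergents</code> and the callable-based interface are my
own illustrative choices. The example feeds in the arctan sequences at
<em>x</em> = 1.</p>

```python
from fractions import Fraction


def wallis_convergents(a, b, n_max):
    """Return [(A_0, B_0), ..., (A_n_max, B_n_max)] from Wallis's recurrence.

    a(i) and b(i) return the partial numerators and denominators;
    a(0) is never used.
    """
    A_prev, B_prev = 1, 0  # the fake "-1" terms
    A, B = b(0), 1         # the zeroth terms
    out = [(A, B)]
    for i in range(1, n_max + 1):
        # right-hand sides are evaluated before assignment, so old values are used
        A, A_prev = b(i) * A + a(i) * A_prev, A
        B, B_prev = b(i) * B + a(i) * B_prev, B
        out.append((A, B))
    return out


# arctan sequences evaluated at x = 1
def a(i):
    return 1 if i == 1 else (i - 1) ** 2


def b(i):
    return 0 if i == 0 else 2 * i - 1


pairs = wallis_convergents(a, b, 6)
pi_approximations = [Fraction(4 * A, B) for A, B in pairs]
print(pairs)
print(pi_approximations)
```

<p>The unreduced <script type="math/tex">A_n</script> and <script type="math/tex">B_n</script> already grow quickly here, which is
the growth the aside above warns about.</p>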
<p>So what does this have to do with evaluating the derivative? Well,
starting from the limit definition of the CF, and assuming the CF
converges absolutely in a neighborhood so we can bring the derivative
inside the limit, we will find the derivative from</p>
<div>
\begin{align}
\label{eq:df_limit}
\frac{df}{dx} = \lim_{n\to\infty} \frac{d}{dx} \frac{A_n}{B_n}
= \lim_{n\to\infty} \frac{A_n'(x) \ B_n - A_n \ B_n'(x)}{B_n^2}
\,.
\end{align}
</div>
<p>So, if we know how to compute the derivatives <script type="math/tex">A_n'(x)</script> and
<script type="math/tex">B_n'(x)</script>, we’ll be in business. All we have to do is differentiate
Eqs. \eqref{eq:A_recurrence} and \eqref{eq:B_recurrence} to get the
recurrence relations,</p>
<div>
\begin{align}
\label{eq:dA_recur}
A_{n+1}' &= b_{n+1}' A_n + b_{n+1} A_n' + a_{n+1}' A_{n-1} + a_{n+1} A_{n-1}' \,, \\
\label{eq:dB_recur}
B_{n+1}' &= b_{n+1}' B_n + b_{n+1} B_n' + a_{n+1}' B_{n-1} + a_{n+1} B_{n-1}' \,.
\end{align}
</div>
<p>Here, think of the <script type="math/tex">a_i'(x)</script> and <script type="math/tex">b_i'(x)</script> as derivatives that you
calculate by hand, since you know the original functions, and think of the
terms <script type="math/tex">A'_i</script> and <script type="math/tex">B'_i</script> as values in a recurrence that we compute
from the bottom up. Of course we need initial values, which we get by
differentiating Eq. \eqref{eq:AB_init},</p>
<div>
\begin{align}
\label{eq:dAB_init}
A'_{-1} = 0, \ B'_{-1} = 0, \ A'_0 = b_0', \ B'_0 = 0 \,.
\end{align}
</div>
<p>Hopefully it is clear how to generalize this to continued fractions
that depend on <em>k</em> variables: each of the derivatives above is just
replaced by a <em>k</em>-dimensional gradient (co)vector, and the result of
all the recurrences is the (<em>k</em>-dimensional) gradient <em>df</em>.</p>
<p>Similarly, if you want the Hessian matrix, or any higher derivative
tensor, just apply more partial derivatives, and keep track of more
auxiliary variables.</p>
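<p>As a rough sketch of the one-variable case, the differentiated recurrences \eqref{eq:dA_recur} and \eqref{eq:dB_recur} can be run alongside the original ones. The function name and the signature <code class="highlighter-rouge">f(i, x)</code> for the coefficient callables are assumptions made for this example. I test it on the continued fraction <script type="math/tex">x + 1/(x + 1/(x + \ldots))</script>, chosen only because it has the closed form <script type="math/tex">(x + \sqrt{x^2+4})/2</script> to compare against.</p>

```python
import math


def wallis_with_derivative(a, b, da, db, x, n):
    """Return (A_n/B_n, d/dx of A_n/B_n) from the differentiated recurrence.

    a, b, da, db are hypothetical callables with signature f(i, x).
    """
    # Initial values and their x-derivatives
    A_prev, B_prev, A, B = 1.0, 0.0, b(0, x), 1.0
    dA_prev, dB_prev, dA, dB = 0.0, 0.0, db(0, x), 0.0
    for i in range(1, n + 1):
        ai, bi, dai, dbi = a(i, x), b(i, x), da(i, x), db(i, x)
        A_next = bi * A + ai * A_prev
        B_next = bi * B + ai * B_prev
        dA_next = dbi * A + bi * dA + dai * A_prev + ai * dA_prev
        dB_next = dbi * B + bi * dB + dai * B_prev + ai * dB_prev
        A_prev, B_prev, A, B = A, B, A_next, B_next
        dA_prev, dB_prev, dA, dB = dA, dB, dA_next, dB_next
    # Quotient rule applied to the final convergent
    return A / B, (dA * B - A * dB) / B**2


# Test case: a_i = 1, b_i = x for all i
x0 = 1.0
f, df = wallis_with_derivative(lambda i, x: 1.0, lambda i, x: x,
                               lambda i, x: 0.0, lambda i, x: 1.0,
                               x0, 40)
exact_f = (x0 + math.sqrt(x0**2 + 4)) / 2
exact_df = (1 + x0 / math.sqrt(x0**2 + 4)) / 2
print(abs(f - exact_f), abs(df - exact_df))
```

<p>For a gradient in <em>k</em> variables, the same loop should work if <code class="highlighter-rouge">da</code> and <code class="highlighter-rouge">db</code> return arrays of length <em>k</em>, since every line is linear in the primed quantities.</p>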
<h2 id="automatic-differentiation-point-of-view">Automatic differentiation point of view</h2>
<p>As a note here, I should thank <a href="https://rcorless.github.io/">Rob
Corless</a> who helped me ensure I
understood how to do the above, and emphasized another point of view.
This point of view is as follows: we should not distinguish between
(a) some abstract mathematical function and (b) a computer algorithm
that can be used to produce arbitrarily precise numerical values from
that function. Just like power series or integrals or continued
fractions or other representations, a computer algorithm <em>is</em> a
representation of a function. Now for different representations of
<script type="math/tex">f(x)</script>, we may find various representations of <script type="math/tex">f'(x)</script> – maybe a
series or integral or the algorithm up above.</p>
<p>How, in general, do you find the derivative of a numerical algorithm?
Your first temptation might be to use finite differences. But we can
do much better. In fact, every algorithm (for a differentiable
function) contains basically <em>all</em> the information on how to compute
its derivative (or gradient, with more arguments). Usually this is
expressed in terms of “<a href="https://en.wikipedia.org/wiki/Automatic_differentiation">automatic
differentiation</a>”
and/or “<a href="https://en.wikipedia.org/wiki/Dual_number">dual numbers</a>”.
Replace every number <em>x</em> with a pair <script type="math/tex">(x, x')</script>. Now define
algebraic operations on these dual numbers, for example,</p>
<div>
\begin{align}
\label{eq:dual_algebra}
(x, x') + c (y, y') &= (x+ c y, x' + c y') \,, \\
(x, x') (y, y') &= (x \, y, x y' + x' y) \,, \\
\frac{(x, x')}{(y, y')} &= \left( \frac{x}{y}, \frac{x' y - x y'}{y^2}
\right) \,,
\end{align}
</div>
<p>and so on. Here we see that the second component is just expressing
linearity of the derivative, the product rule, and the quotient rule.
Now as your algorithm is doing some calculation, it is also keeping
track of the derivative – as long as it knows a few basic rules like
these.</p>
<p>If you are working with a language that allows polymorphism or
generics, then you can promote any numerical algorithm to one that can
automatically compute its own derivative (in “forward mode”
auto-diff). Just build an algebraic type for dual numbers and
overload all of its arithmetic operations. You can also make
specializations for special functions when you can compute the
derivative by hand, for example</p>
<div>
\begin{align}
\label{eq:dual_sin}
\sin((x, x')) = (\sin(x), \cos(x) x')
\,.
\end{align}
</div>
<p>So, if you have an implementation of an algorithm for computing a
continued fraction, then you can automatically get an algorithm for
computing the derivative of a continued fraction. Or, you can
implement the derivative of Wallis’s algorithm above, or for the
modified Lentz method below.</p>
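<p>To make the dual-number idea concrete, here is a small sketch of such a type in Python. The class name <code class="highlighter-rouge">Dual</code>, the helper <code class="highlighter-rouge">dual_sin</code>, and the test function <code class="highlighter-rouge">cf_value</code> are all made up for this illustration, and only the handful of operations needed here are overloaded. The point is that <code class="highlighter-rouge">cf_value</code> is written with no thought of derivatives, yet returns one when fed a dual number.</p>

```python
import math


class Dual:
    """A pair (val, der) obeying the dual-number arithmetic rules."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def lift(z):
        # Plain numbers are constants: their derivative part is zero
        return z if isinstance(z, Dual) else Dual(z, 0.0)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = Dual.lift(other)
        return Dual(self.val / other.val,
                    (self.der * other.val - self.val * other.der) / other.val**2)

    def __rtruediv__(self, other):
        return Dual.lift(other) / self


def dual_sin(z):
    """Specialization for sin, using the hand-computed derivative cos."""
    return Dual(math.sin(z.val), math.cos(z.val) * z.der)


def cf_value(x, n_terms=40):
    """Truncation of x + 1/(x + 1/(x + ...)), evaluated from the inside out."""
    tail = x
    for _ in range(n_terms):
        tail = x + 1 / tail
    return tail


# Seed the derivative part with dx/dx = 1
result = cf_value(Dual(1.0, 1.0))
print(result.val, result.der)
```

<p>The same <code class="highlighter-rouge">cf_value</code> also accepts an ordinary float, which is the sense in which the algorithm has been promoted rather than rewritten.</p>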
<h1 id="modified-lentz-method">Modified Lentz method</h1>
<p>To avoid the possibly exponential growth of the <script type="math/tex">A_n</script> and <script type="math/tex">B_n</script>
coefficients, the modified Lentz method<sup id="fnref:4:1"><a href="#fn:4" class="footnote">4</a></sup> instead constructs a
recurrence for their successive ratios,</p>
<div>
\begin{align}
\label{eq:CD_def}
C_n\equiv \frac{A_n}{A_{n-1}}, \ D_n \equiv \frac{B_{n-1}}{B_n}
\,,
\end{align}
</div>
<p>(but we never actually need the <script type="math/tex">A</script>’s or <script type="math/tex">B</script>’s). From the
recurrence relations for <script type="math/tex">A_n</script> and <script type="math/tex">B_n</script>, we get the new
recurrence relations</p>
<div>
\begin{align}
\label{eq:CD_recur}
C_n = b_n + a_n / C_{n-1} \,, \
D_n = 1/(b_n + a_n D_{n-1})
\,.
\end{align}
</div>
<p>And finally to compute the CF, we multiply these successive ratios
together,</p>
<div>
\begin{align}
\label{eq:f_Lentz}
f_n = f_{n-1} C_n D_n
\,.
\end{align}
</div>
<p>As before we need to start the recurrence with initial conditions,</p>
<div>
\begin{align}
\label{eq:lentz_IC}
f_0 = C_0 = b_0 \,, \ D_0 = 0 \,.
\end{align}
</div>
<p>But there is a danger in the Lentz method: the division
steps in Eq. \eqref{eq:CD_recur} can divide by zero. To avoid this
pitfall, <script type="math/tex">C_n</script> is set to a tiny but non-zero value if it ever
exactly vanishes (this includes the initial condition
\eqref{eq:lentz_IC}). Similarly, if the denominator of <script type="math/tex">D_n</script> ever
exactly vanishes, then <script type="math/tex">D_n</script> is replaced with the reciprocal of that
tiny number.</p>
<p>Finally, this algorithm needs a stopping condition. This is usually
determined by testing if the relative change <script type="math/tex">|1-C_n D_n|</script> between
successive approximants is smaller than your desired tolerance.</p>
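<p>Before adding derivatives, a bare-bones sketch of the value-only iteration may help. The function name <code class="highlighter-rouge">lentz</code>, its keyword defaults, and the choice to raise an exception on non-convergence are my own assumptions for this example. The test case is the continued fraction <script type="math/tex">1 + 1/(2 + 1/(2 + \ldots)) = \sqrt{2}</script>.</p>

```python
import math


def lentz(a, b, tol=1e-14, max_iter=200, tiny=1e-30):
    """Evaluate b(0) + a(1)/(b(1) + a(2)/(b(2) + ...)) by modified Lentz.

    a(n) and b(n) are hypothetical callables returning the coefficients.
    Returns the value and the number of iterations used.
    """
    f = b(0)
    if f == 0:
        f = tiny                 # guard the first division by C
    C, D = f, 0.0
    for n in range(1, max_iter + 1):
        D = b(n) + a(n) * D
        if D == 0:
            D = tiny             # guard the reciprocal below
        D = 1.0 / D
        C = b(n) + a(n) / C
        if C == 0:
            C = tiny             # guard the next iteration's division
        delta = C * D            # ratio of successive approximants
        f *= delta
        if abs(delta - 1.0) < tol:
            return f, n
    raise RuntimeError("continued fraction did not converge")


value, n_iter = lentz(lambda n: 1.0, lambda n: 1.0 if n == 0 else 2.0)
print(value, math.sqrt(2.0), n_iter)
```

<p>Only the two ratios and the running value are stored, so nothing grows the way the <script type="math/tex">A_n</script> and <script type="math/tex">B_n</script> can.</p>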
<h1 id="derivatives-in-modified-lentz-method">Derivatives in modified Lentz method</h1>
<p>To compute the derivative of a CF using the modified Lentz method, we
again assume we’re handed methods to compute <script type="math/tex">a'_n(x)</script> and
<script type="math/tex">b'_n(x)</script> (or <em>k</em>-dimensional gradients). Then we differentiate all
the recurrence relations above, to find a recurrence for the
derivatives,</p>
<div>
\begin{align}
\label{eq:lentz_der}
C'_n &= b'_n + (a'_n C_{n-1} - a_n C'_{n-1}) / C_{n-1}^2 \,, \\
D'_n &= - D_n^2 (b'_n + a'_n D_{n-1} + a_n D'_{n-1}) \,, \\
f'_n &= f'_{n-1} C_n D_n + f_{n-1} C'_n D_n + f_{n-1} C_n D'_n \,.
\end{align}
</div>
<p>Here the only dangerous division is by <script type="math/tex">C_{n-1}</script>, which is replaced
with a tiny number if it exactly vanishes.</p>
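<p>In case the middle line of Eq. \eqref{eq:lentz_der} looks mysterious, here is the intermediate step spelled out. It uses nothing beyond the chain rule applied to Eq. \eqref{eq:CD_recur}, followed by recognizing the squared denominator as <script type="math/tex">1/D_n^2</script>:</p>
<div>
\begin{align}
D'_n &= \frac{d}{dx} \left( b_n + a_n D_{n-1} \right)^{-1} \\
&= - \frac{b'_n + a'_n D_{n-1} + a_n D'_{n-1}}{\left( b_n + a_n D_{n-1} \right)^2} \\
&= - D_n^2 \left( b'_n + a'_n D_{n-1} + a_n D'_{n-1} \right) \,.
\end{align}
</div>
<p>The line for <script type="math/tex">C'_n</script> is the quotient rule applied to <script type="math/tex">a_n / C_{n-1}</script>, and the line for <script type="math/tex">f'_n</script> is the product rule applied to Eq. \eqref{eq:f_Lentz}.</p>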
<h1 id="code-example">Code example</h1>
<p>Pseudocode for the modified Lentz method is listed in Numerical Recipes<sup id="fnref:5"><a href="#fn:5" class="footnote">5</a></sup> and in the
freely-available article by Press and Teukolsky<sup id="fnref:4:2"><a href="#fn:4" class="footnote">4</a></sup>. I implemented this in Python in my
package <a href="https://github.com/duetosymmetry/qnm"><code class="highlighter-rouge">qnm</code></a>, since computing
the quasinormal mode frequencies of black holes requires finding roots
of continued fraction equations. Here let me list a version that
also computes the derivative of the continued fraction at the same
time. The user should specify functions <code class="highlighter-rouge">a, b, da, db</code> that
return the values of <script type="math/tex">a_n, b_n, a'_n, b'_n</script>. I have also modified
the stopping condition so that it can be made to perform a minimum or
maximum number of iterations (steps of the recurrence).</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">numpy</span> <span class="k">as</span> <span class="n">np</span>

<span class="k">def</span> <span class="nf">lentz_with_grad</span><span class="p">(</span><span class="n">a</span><span class="p">,</span> <span class="n">b</span><span class="p">,</span> <span class="n">da</span><span class="p">,</span> <span class="n">db</span><span class="p">,</span>
                    <span class="n">args</span><span class="o">=</span><span class="p">(),</span>
                    <span class="n">tol</span><span class="o">=</span><span class="mf">1.e-10</span><span class="p">,</span>
                    <span class="n">N_min</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">N_max</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">inf</span><span class="p">,</span>
                    <span class="n">tiny</span><span class="o">=</span><span class="mf">1.e-30</span><span class="p">):</span>
    <span class="s">"""Compute a continued fraction (and its derivative) via modified
    Lentz's method.

    This implementation is by the book [1]_. The value to compute is:
      b_0 + a_1/( b_1 + a_2/( b_2 + a_3/( b_3 + ...)))
    where a_n = a(n, *args) and b_n = b(n, *args).

    Parameters
    ----------
    a: callable returning numeric.
    b: callable returning numeric.
    da: callable returning array-like.
    db: callable returning array-like.

    args: tuple [default: ()]
      Additional arguments to pass to the user-defined functions a, b,
      da, and db. If given, the additional arguments are passed to
      all user-defined functions, e.g. `a(n, *args)`. So if, for
      example, `a` has the signature `a(n, x, y)`, then `b` must have
      the same signature, and `args` must be a tuple of length 2,
      `args=(x,y)`.

    tol: float [default: 1.e-10]
      Tolerance for termination of evaluation.

    N_min: int [default: 0]
      Minimum number of iterations to evaluate.

    N_max: int or comparable [default: np.inf]
      Maximum number of iterations to evaluate.

    tiny: float [default: 1.e-30]
      Very small number to control convergence of Lentz's method when
      there is cancellation in a denominator.

    Returns
    -------
    (float, array-like, float, int)
      The first element of the tuple is the value of the continued
      fraction.
      The second element is the gradient.
      The third element is the estimated error.
      The fourth element is the number of iterations.

    References
    ----------
    .. [1] WH Press, SA Teukolsky, WT Vetterling, BP Flannery,
       "Numerical Recipes," 3rd Ed., Cambridge University Press 2007,
       ISBN 0521880688, 9780521880688 .
    """</span>

    <span class="k">if</span> <span class="ow">not</span> <span class="nb">isinstance</span><span class="p">(</span><span class="n">args</span><span class="p">,</span> <span class="nb">tuple</span><span class="p">):</span>
        <span class="n">args</span> <span class="o">=</span> <span class="p">(</span><span class="n">args</span><span class="p">,)</span>

    <span class="n">f_old</span> <span class="o">=</span> <span class="n">b</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">)</span>

    <span class="k">if</span> <span class="p">(</span><span class="n">f_old</span> <span class="o">==</span> <span class="mi">0</span><span class="p">):</span>
        <span class="n">f_old</span> <span class="o">=</span> <span class="n">tiny</span>

    <span class="n">C_old</span> <span class="o">=</span> <span class="n">f_old</span>
    <span class="n">D_old</span> <span class="o">=</span> <span class="mf">0.</span>

    <span class="c"># f_0 = b_0, so df_0 = db_0</span>
    <span class="n">df_old</span> <span class="o">=</span> <span class="n">db</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">)</span>
    <span class="n">dC_old</span> <span class="o">=</span> <span class="n">df_old</span>
    <span class="n">dD_old</span> <span class="o">=</span> <span class="mf">0.</span>

    <span class="n">conv</span> <span class="o">=</span> <span class="bp">False</span>
    <span class="n">j</span> <span class="o">=</span> <span class="mi">1</span>

    <span class="k">while</span> <span class="p">((</span><span class="ow">not</span> <span class="n">conv</span><span class="p">)</span> <span class="ow">and</span> <span class="p">(</span><span class="n">j</span> <span class="o"><</span> <span class="n">N_max</span><span class="p">)):</span>

        <span class="n">aj</span><span class="p">,</span> <span class="n">bj</span> <span class="o">=</span> <span class="n">a</span><span class="p">(</span><span class="n">j</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">),</span> <span class="n">b</span><span class="p">(</span><span class="n">j</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">)</span>
        <span class="n">daj</span><span class="p">,</span> <span class="n">dbj</span> <span class="o">=</span> <span class="n">da</span><span class="p">(</span><span class="n">j</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">),</span> <span class="n">db</span><span class="p">(</span><span class="n">j</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">)</span>

        <span class="c"># First: modified Lentz</span>
        <span class="n">D_new</span> <span class="o">=</span> <span class="n">bj</span> <span class="o">+</span> <span class="n">aj</span> <span class="o">*</span> <span class="n">D_old</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">D_new</span> <span class="o">==</span> <span class="mi">0</span><span class="p">):</span>
            <span class="n">D_new</span> <span class="o">=</span> <span class="n">tiny</span>
        <span class="n">D_new</span> <span class="o">=</span> <span class="mf">1.</span><span class="o">/</span><span class="n">D_new</span>

        <span class="n">C_new</span> <span class="o">=</span> <span class="n">bj</span> <span class="o">+</span> <span class="n">aj</span> <span class="o">/</span> <span class="n">C_old</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">C_new</span> <span class="o">==</span> <span class="mi">0</span><span class="p">):</span>
            <span class="n">C_new</span> <span class="o">=</span> <span class="n">tiny</span>

        <span class="n">Delta</span> <span class="o">=</span> <span class="n">C_new</span> <span class="o">*</span> <span class="n">D_new</span>
        <span class="n">f_new</span> <span class="o">=</span> <span class="n">f_old</span> <span class="o">*</span> <span class="n">Delta</span>

        <span class="c"># Second: the derivative calculations</span>
        <span class="c"># The only possibly dangerous denominator is C_old,</span>
        <span class="c"># but it can't be 0 (at worst it's "tiny")</span>
        <span class="n">dC_new</span> <span class="o">=</span> <span class="n">dbj</span> <span class="o">+</span> <span class="p">(</span><span class="n">daj</span><span class="o">*</span><span class="n">C_old</span> <span class="o">-</span> <span class="n">aj</span><span class="o">*</span><span class="n">dC_old</span><span class="p">)</span><span class="o">/</span><span class="p">(</span><span class="n">C_old</span><span class="o">*</span><span class="n">C_old</span><span class="p">)</span>
        <span class="n">dD_new</span> <span class="o">=</span> <span class="o">-</span><span class="n">D_new</span><span class="o">*</span><span class="n">D_new</span><span class="o">*</span><span class="p">(</span><span class="n">dbj</span> <span class="o">+</span> <span class="n">daj</span><span class="o">*</span><span class="n">D_old</span> <span class="o">+</span> <span class="n">aj</span><span class="o">*</span><span class="n">dD_old</span><span class="p">)</span>
        <span class="n">df_new</span> <span class="o">=</span> <span class="n">df_old</span><span class="o">*</span><span class="n">Delta</span> <span class="o">+</span> <span class="n">f_old</span><span class="o">*</span><span class="n">dC_new</span><span class="o">*</span><span class="n">D_new</span> <span class="o">+</span> <span class="n">f_old</span><span class="o">*</span><span class="n">C_new</span><span class="o">*</span><span class="n">dD_new</span>

        <span class="c"># Did we converge?</span>
        <span class="k">if</span> <span class="p">((</span><span class="n">j</span> <span class="o">></span> <span class="n">N_min</span><span class="p">)</span> <span class="ow">and</span> <span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="nb">abs</span><span class="p">(</span><span class="n">Delta</span> <span class="o">-</span> <span class="mf">1.</span><span class="p">)</span> <span class="o"><</span> <span class="n">tol</span><span class="p">)):</span>
            <span class="n">conv</span> <span class="o">=</span> <span class="bp">True</span>

        <span class="c"># Set up for next iter</span>
        <span class="n">j</span> <span class="o">=</span> <span class="n">j</span> <span class="o">+</span> <span class="mi">1</span>
        <span class="n">C_old</span> <span class="o">=</span> <span class="n">C_new</span>
        <span class="n">D_old</span> <span class="o">=</span> <span class="n">D_new</span>
        <span class="n">f_old</span> <span class="o">=</span> <span class="n">f_new</span>
        <span class="n">dC_old</span> <span class="o">=</span> <span class="n">dC_new</span>
        <span class="n">dD_old</span> <span class="o">=</span> <span class="n">dD_new</span>
        <span class="n">df_old</span> <span class="o">=</span> <span class="n">df_new</span>

    <span class="c"># Success or failure can be assessed by the user</span>
    <span class="k">return</span> <span class="n">f_new</span><span class="p">,</span> <span class="n">df_new</span><span class="p">,</span> <span class="n">np</span><span class="o">.</span><span class="nb">abs</span><span class="p">(</span><span class="n">Delta</span> <span class="o">-</span> <span class="mf">1.</span><span class="p">),</span> <span class="n">j</span><span class="o">-</span><span class="mi">1</span>
</code></pre></div></div>
<p>Now let’s demonstrate this with a calculation of the continued
fraction for <script type="math/tex">\tan(x)</script>. Of course the tangent function is already
included in almost any numerical library… and from first year
calculus, we know that its derivative is <script type="math/tex">\tan'(x) = \sec^2(x)</script>,
which we can also compute from a numerical library. But sometimes we
don’t have these luxuries! Anyway, the continued fraction is</p>
<div>
\begin{align}
\label{eq:tan_CF}
\tan(x) = 0 + \frac{x}{1-}\frac{x^2}{3-}\frac{x^2}{5-}\ldots,
\end{align}
</div>
<p>which is given by the two sequences</p>
<div>
\begin{align}
\label{eq:tan_ab}
a_1(x) &= x \,, & a_i(x) &= -x^2 \quad (i \ge 2) \,, \\
b_0(x) &= 0 \,, & b_i(x) &= 2i-1 \quad (i \ge 1) \,.
\end{align}
</div>
<p>Taking derivatives we immediately get</p>
<div>
\begin{align}
\label{eq:tan_dab}
a'_1(x) &= 1 \,, & a'_i(x) &= -2x \quad (i \ge 2) \,, \\
&& b'_i(x) &= 0 \quad (i \ge 0) \,.
\end{align}
</div>
<p>We can code these up in a few lines:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">tanx_a</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span>
    <span class="k">return</span> <span class="n">x</span> <span class="k">if</span> <span class="n">n</span><span class="o">==</span><span class="mi">1</span> <span class="k">else</span> <span class="o">-</span><span class="n">x</span><span class="o">*</span><span class="n">x</span>

<span class="k">def</span> <span class="nf">tanx_b</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span>
    <span class="k">return</span> <span class="mf">0.</span> <span class="k">if</span> <span class="n">n</span><span class="o">==</span><span class="mi">0</span> <span class="k">else</span> <span class="mi">2</span><span class="o">*</span><span class="n">n</span><span class="o">-</span><span class="mi">1</span>

<span class="k">def</span> <span class="nf">tanx_da</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span>
    <span class="k">return</span> <span class="mf">1.</span> <span class="k">if</span> <span class="n">n</span><span class="o">==</span><span class="mi">1</span> <span class="k">else</span> <span class="o">-</span><span class="mi">2</span><span class="o">*</span><span class="n">x</span>

<span class="k">def</span> <span class="nf">tanx_db</span><span class="p">(</span><span class="n">n</span><span class="p">,</span> <span class="n">x</span><span class="p">):</span>
    <span class="k">return</span> <span class="mf">0.</span>
</code></pre></div></div>
<p>And finally, let’s call the routine to evaluate the continued fraction
<script type="math/tex">\tan(1)</script>, which will also compute the derivative,
<script type="math/tex">\sec^2(1)</script>. We specify <script type="math/tex">x=1</script> with the parameter <code class="highlighter-rouge">args=1.</code> (which
should be a tuple for functions that take more than one parameter).
Let’s ask for 15 digits of accuracy:</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">>>></span> <span class="n">lentz_with_grad</span><span class="p">(</span><span class="n">tanx_a</span><span class="p">,</span> <span class="n">tanx_b</span><span class="p">,</span>
<span class="o">...</span> <span class="n">tanx_da</span><span class="p">,</span> <span class="n">tanx_db</span><span class="p">,</span>
<span class="o">...</span> <span class="n">args</span><span class="o">=</span><span class="mf">1.</span><span class="p">,</span> <span class="n">tol</span><span class="o">=</span><span class="mf">1.e-15</span><span class="p">)</span>
<span class="p">(</span><span class="mf">1.5574077246549018</span><span class="p">,</span> <span class="mf">3.4255188208147596</span><span class="p">,</span> <span class="mf">2.220446049250313e-16</span><span class="p">,</span> <span class="mi">10</span><span class="p">)</span>
</code></pre></div></div>
<p>The return value tells us that <script type="math/tex">\tan(1)\approx 1.5574077246549018</script>,
<script type="math/tex">\sec^2(1) \approx 3.4255188208147596</script>, the estimated error on the
function value is <script type="math/tex">\approx 2.2\times 10^{-16}</script>, and it only took 10
iterations to compute these two numbers! Just for peace of mind,
let’s check these values with the library routines in <code class="highlighter-rouge">numpy</code> (which
are really from <code class="highlighter-rouge">libm</code>)</p>
<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">>>></span> <span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">tan</span><span class="p">(</span><span class="mf">1.</span><span class="p">),</span> <span class="mi">1</span><span class="o">/</span><span class="n">np</span><span class="o">.</span><span class="n">cos</span><span class="p">(</span><span class="mf">1.</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span>
<span class="p">(</span><span class="mf">1.557407724654902</span><span class="p">,</span> <span class="mf">3.425518820814759</span><span class="p">)</span>
</code></pre></div></div>
<p>They agree! And indeed, the difference between the CF approach and the
result from the standard library is about <script type="math/tex">2\times
10^{-16}</script>, in agreement with the estimated error.</p>
<h1 id="references">References</h1>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Wall, <em>Analytic theory of continued fractions</em>, (1948), Chelsea
Publishing Company. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>Cuyt <em>et al.</em>, <em>Handbook of continued fractions for special
functions</em>, (2008), Springer. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>Blanch, <em>Numerical Evaluation of Continued Fractions</em>, <a href="https://doi.org/10.1137/1006092">SIAM
Review, 6(4), 383-421 (1964)</a>. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>Press and Teukolsky, <em>Evaluating continued fractions and
computing exponential integrals</em>, <a href="https://doi.org/10.1063/1.4822777">Computers in Physics, 2(5),
88-89 (1988)</a>. <a href="#fnref:4" class="reversefootnote">↩</a> <a href="#fnref:4:1" class="reversefootnote">↩<sup>2</sup></a> <a href="#fnref:4:2" class="reversefootnote">↩<sup>3</sup></a></p>
</li>
<li id="fn:5">
<p>Press <em>et al.</em>, <a href="http://numerical.recipes/">Numerical Recipes</a>,
3rd ed. (2007), Cambridge University Press. <a href="#fnref:5" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
<p>Surprises in a classic boundary-layer problem (2021-07-26), https://duetosymmetry.com/pubs/surprises-BVP</p>
<p class="align-right" style="width: 350px"><img src="https://duetosymmetry.com/images/BVP-pitchfork.jpg" alt="" /></p>
<blockquote>
<p>We revisit a textbook example of a singularly perturbed nonlinear
boundary-value problem. Unexpectedly, it shows a wealth of phenomena
that seem to have been overlooked previously, including a pitchfork
bifurcation in the number of solutions as one varies the small
parameter, and transcendentally small terms in the initial
conditions that can be calculated by elementary means. Based on our
own classroom experience, we believe this problem could provide an
enjoyable workout for students in courses on perturbation methods,
applied dynamical systems, or numerical analysis.</p>
</blockquote>