Linear Algebra Background
What are eigenvectors?
For readers unfamiliar with linear algebra, some of the earliest applications of eigen decomposition, developed in the 1700s, described the linear transformations of physical systems (rotations, shifts, scalings, and shears of rigid bodies). Of particular interest in these descriptions are the principal axes or “eigenvectors” of the transformation, which are the only vectors that do not change direction during the transform. The eigenspectrum describes the complete set of these axes (“eigenvectors”) and is defined as the non-zero solutions to this equation
\[
C\vec{v} = \lambda\vec{v}
\tag{1}\]
In Equation 1, \(C\) is the linear transformation, represented as a matrix of real numbers; \(\vec{v}\) is the eigenvector, represented as a list of real numbers; and \(\lambda\) is the eigenvalue, a real number that shortens or lengthens the eigenvector.
As an example, this is one specific case of Equation 1.
\[
\begin{bmatrix}2&0\\0&2\end{bmatrix}\begin{bmatrix}1\\1\end{bmatrix} = 2\begin{bmatrix}1\\1\end{bmatrix}
\]
By performing the matrix multiplication, we can see that both sides of this expression are equal to the vector \([2,2]^t\)
\[
\begin{align}
\begin{bmatrix}2&0\\0&2\end{bmatrix}\begin{bmatrix}v_1\\v_2\end{bmatrix} &= \begin{bmatrix}(2\times v_1) + (0\times v_2)\\(0\times v_1)+(2\times v_2)\end{bmatrix} \\
&= \begin{bmatrix}(2\times 1) + (0\times1)\\(0\times1)+(2\times1)\end{bmatrix} \\
& = \begin{bmatrix}2\\2\end{bmatrix} \\
&= 2\begin{bmatrix}1\\1\end{bmatrix}
\end{align}
\]
The eigenvectors and eigenvalues are useful precisely because they are the only stable descriptors of the transformation and can be used to consistently describe positions both before and after the transform.
To mathematically solve for these eigenvectors, one common technique is to first find the eigenvalues, which are the solutions to Equation 2, and then substitute these eigenvalues into Equation 1 and solve for each of the eigenvectors (\(\vec{v}\)). All applications of spectral factorization (e.g., SVD and PCA) use this fundamental equation to define their spectral components (‘eigenvectors’).
\[
\det(C - \lambda I) = 0
\tag{2}\]
In Equation 2, \(C\) is again the linear transformation; \(I\) is the identity matrix of the same size as \(C\), defined as having \(1\)s along the diagonal and \(0\) for all other entries; \(\det()\) is the determinant function, which describes the change in volume (and thus in degrees of freedom, or number of dimensions) after a linear transformation; and \(\lambda\) is the variable that we are trying to infer (i.e., \(\lambda\) equals an eigenvalue exactly when Equation 2 is true).
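For readers who want to verify these definitions numerically, here is a minimal sketch using Python's numpy library (an assumed tool; the text itself is software-agnostic). It checks Equation 1 and Equation 2 on the scaling matrix from the example above.

```python
# Numerical check of Equations 1 and 2 on the 2x2 scaling example above.
import numpy as np

C = np.array([[2.0, 0.0],
              [0.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(C)   # columns of `eigenvectors` are the v's

for lam, v in zip(eigenvalues, eigenvectors.T):
    # Equation 1: C v = lambda v
    assert np.allclose(C @ v, lam * v)
    # Equation 2: det(C - lambda I) = 0 at each eigenvalue
    assert np.isclose(np.linalg.det(C - lam * np.eye(2)), 0.0)

print(eigenvalues)   # [2. 2.] -- for uniform scaling, every direction is an eigenvector
```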
As we will see, Equation 2 can also be expressed as the “characteristic polynomial” of the transform.
\[
\det(\lambda I - C_{n\times n}) = \lambda^n - c_{n-1}\lambda^{n-1} + c_{n-2}\lambda^{n-2} - c_{n-3}\lambda^{n-3}... = 0
\tag{3}\]
The characteristic polynomial is a monic, alternating-sign polynomial of degree \(n\), where \(n\) is the dimension of the square matrix \(C\). A monic polynomial means that the leading coefficient is always equal to \(1\). Alternating sign means that the sign of each subsequent term alternates from \(+\) to \(-\) and back. The roots of this polynomial (i.e., the values of \(\lambda\) that set it equal to zero) are the eigenvalues of Equation 2.
Note the change from \(\det(C - \lambda I)\) in Equation 2 to \(\det(\lambda I - C)\) in Equation 3.
This factor of \(-1\) ensures that the characteristic polynomial is always expressed with a positive leading coefficient. The definition in Equation 2, \(\det(C - \lambda I)\) has a negative leading coefficient for any matrix \(C\) with an odd number of rows and columns.
It can be awkward to display the expansion of \(\det(\lambda I - C)\), so we forgo showing the \(-1\) factor in this document until the last step to finish with the canonical form of the characteristic polynomial.
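As a small numerical illustration (again assuming numpy), `numpy.poly` returns the monic coefficients of \(\det(\lambda I - C)\) described above; the \(2\times 2\) symmetric matrix here is a hypothetical stand-in example, not one from the text.

```python
# Sketch: numpy.poly(C) returns the coefficients of the monic characteristic
# polynomial det(lambda*I - C), highest degree first, as in Equation 3.
import numpy as np

C = np.array([[2.0, 1.0],
              [1.0, 2.0]])

coeffs = np.poly(C)        # [ 1. -4.  3.] (up to rounding) -> lambda^2 - 4*lambda + 3
print(coeffs)
print(np.roots(coeffs))    # roots of the polynomial -> the eigenvalues 3 and 1
```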
Rationale for using \(\det(C-\lambda I) = 0\)
To fully intuit the logic behind using the equation \(\det(C-\lambda I) = 0\) to solve for eigenvalues, it helps to review how matrices in linear algebra describe transformations of space.
If we multiply a vector \(\vec{u}\) by a matrix \(C\), we get a new vector \(\vec{v}\) that may point in a new direction.
\[
C\vec{u} = \vec{v}
\]
To reiterate, there are special vectors for each matrix \(C\) that do not change direction after the transformation – \(C\vec{v} = \lambda\vec{v}\). Instead they only change in magnitude. These are exactly the eigenvectors of the matrix, and the magnitude (length of the vector) \(\lambda\) is the eigenvalue. These eigenvectors and values are particularly important for describing the transformation and describing how points move through the transformation because they are the only stable axes during the transformation.
To solve for the eigenvectors we can do some algebraic manipulations to factor out \(\vec{v}\) within Equation 1
\[
\begin{align}
C\vec{v} &= \lambda\vec{v} \\
C\vec{v} - \lambda\vec{v} &= 0 \\
C\vec{v} - \lambda I \vec{v} &= 0 \\
(C - \lambda I )\vec{v} &= 0 \\
\end{align}
\]
After these manipulations we have the eigenvector \(\vec{v}\) multiplied by \((C - \lambda I)\): our original transform \(C\) with the eigenvalue subtracted along the diagonal.
Expanding at this stage would show this form:
\[
(C - \lambda I)\vec{v} = \begin{bmatrix}
c_{1,1} - \lambda&\cdots&c_{1,n} \\
\vdots&\ddots&\vdots \\
c_{n,1}&\cdots&c_{n,n}- \lambda\\
\end{bmatrix} \begin{bmatrix}v_1\\\vdots\\v_n\end{bmatrix} = 0
\]
Notice that making \(\vec{v}\) equal to the zero vector is a trivial solution for all matrices. Because this solution is true for all matrices, eigenvectors are generally defined as only the non-zero solutions to Equation 1.
So when is \(\vec{v}\) non-zero, and yet multiplying by \((C - \lambda I)\) results in zero? Our prior manipulations are useful for answering this question because they let us run a quick thought experiment. Let’s take a moment to assume that \((C - \lambda I)\) is invertible. Invertible means that there exists some matrix \((C - \lambda I)^{-1}\) that would completely reverse the transformation and place every vector back where it started (i.e., the transformation matrix is canceled by its inverse transformation). So \((C - \lambda I)^{-1}(C - \lambda I) = I\), the identity matrix, because for any invertible matrix \(A\), \(A^{-1}A\vec{v} = I\vec{v} = \vec{v}\).
If we place this inverse into the eigenvector equation from before, we see that if \((C - \lambda I)\) is invertible then \(\vec{v}\) must equal the zero vector.
\[
\begin{align}
(C - \lambda I)\vec{v} &= 0 \\
(C - \lambda I)^{-1} (C - \lambda I)\vec{v} &= (C - \lambda I)^{-1}0 \\
I\vec{v} &= 0 \\
\vec{v} &= 0 \\
\end{align}
\]
From this thought experiment we see that for \(\vec{v}\) to be non-zero, \((C - \lambda I)\) must be non-invertible. A matrix being non-invertible essentially means that at least two distinct vectors are placed in the same location after the transformation, \(A\vec{u} = A\vec{w}\). In geometric terms, we can think of this as a reduction in dimensionality: a line being compressed into a single point, or a plane being compressed into a line or point.
We can then interpret our search for eigenvalues as a search for intrinsic scales of dimensionality. We can expand a sphere from the origin, and at particular radii we cancel out dimensional axes; i.e., for particular values of \(\lambda\) we subtract the full magnitude of a principal axis in our observed data, which flattens all observations along that axis to zero.
The determinant function is how we measure this collapse of dimensionality. In the two-dimensional case, the geometric interpretation of the determinant is that it measures the area of the unit square after a transformation by matrix \(C\). If the transform \(C\) compresses all the points of the unit square onto a single line or point, then (a) the area of the unit square after the transform equals zero, \(\det(C)=0\), and (b) the transform is not invertible. The transform is not invertible because multiple points are moved onto the same coordinate, so moving every point back to its original location would require information lost by the transform. The determinant is used for finding eigenvalues because it is the exact measure of when a transform is non-invertible (i.e., has a loss of dimensionality).
The determinant defined
Analytically, the determinant of a 2 x 2 matrix is defined as:
\[
\det{\begin{pmatrix} a & b \\ c & d \end{pmatrix}} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc
\]
And it is defined recursively for larger matrices.
\[
\det(C_{n\times n}) = \sum_{j=1}^n \sigma_j C_{1,j} \det(C_{-1, -j})
\tag{4}\]
where \(\sigma_j\) is defined as \(+1\) if \(j\) is odd and \(-1\) if \(j\) is even; \(C_{-1, -j}\) is the \((n-1) \times (n-1)\) matrix made by removing the first row and the \(j\)th column.
For example, here is the first layer of recursion for a 3 x 3 matrix, which is defined as an alternating sum of 2 x 2 determinants weighted by each complementary element of the top row.
\[
\left|\begin{array}{ccc}
a & b & c \\
d & e & f \\
g & h & i \\
\end{array}\right| \\
= a\left|\begin{array}{cc}e&f\\h&i\end{array}\right| -
b\left|\begin{array}{cc}d&f\\g&i\end{array}\right| +
c\left|\begin{array}{cc}d&e\\g&h\end{array}\right|
\]
Expanded fully, the determinant forms a sum of \(n\) factorial (\(n!\)) products where \(n\) is the number of columns in the matrix.
\[
aei - afh - bdi + bfg + cdh - ceg
\]
The relationship between determinant and characteristic polynomial
When all elements of the matrix are known, this sum of products collapses down to a single number: the result of the determinant. When there is an unknown variable in the matrix, as in our eigenvalue problem in Equation 2, the determinant instead collapses to a polynomial in the unknown variable. This polynomial is the characteristic polynomial of the matrix.
As we can show with a 3x3 example:
\[
\begin{align}
0 &= \det(C - \lambda I) \\
&= \left|\begin{array}{ccc}
a - \lambda & b & c \\
d & e - \lambda & f \\
g & h & i - \lambda \\
\end{array}\right| \\
&= (a - \lambda)\left|\begin{array}{cc}e - \lambda&f\\h&i - \lambda\end{array}\right| -
b\left|\begin{array}{cc}d&f\\g&i - \lambda\end{array}\right| +
c\left|\begin{array}{cc}d&e - \lambda\\g&h\end{array}\right|
\end{align}
\]
When expanded into the alternating sum of \(n!\) terms we see that some of these terms have more instances of \(\lambda\) than others
\[
(a - \lambda)(e - \lambda)(i - \lambda) - (a - \lambda)fh - bd(i - \lambda) + bfg + cdh - c(e - \lambda)g
\]
We can sort by the number of instances of \(\lambda\)
\[
(a - \lambda)(e - \lambda)(i - \lambda) - fh(a - \lambda) - bd(i - \lambda) - cg(e - \lambda) + bfg + cdh
\]
To continue, let’s make this more specific and take it term by term.
For this example we will use the matrix
\[
C = \begin{bmatrix}1&1&1\\1&1&1\\1&1&1\end{bmatrix}
\]
So for this first term we can expand
\[
\begin{align}
(a - \lambda)(e - \lambda)(i - \lambda) &= (1 - \lambda)(1 - \lambda)(1 - \lambda) \\
&= -\lambda^3 + 3\lambda^2 - 3\lambda + 1
\end{align}
\]
for the second, third, and fourth terms we get
\[
\begin{align}
-fh(a - \lambda) &= -(1\times 1)(1 - \lambda) = \lambda - 1 \\
-bd(i - \lambda) &= -(1\times 1)(1 - \lambda) = \lambda - 1 \\
-cg(e - \lambda) &= -(1\times 1)(1 - \lambda) = \lambda - 1
\end{align}
\]
and our fifth and sixth terms both equal \(+1\).
\[bfg = cdh = 1\cdot 1 \cdot 1 = 1\]
Putting them together we get
\[
\begin{align}
&(-\lambda^3 + 3\lambda^2 - 3\lambda + 1) + (\lambda - 1) + (\lambda - 1) + (\lambda - 1) + 1 + 1 \\
&= -\lambda^3 + 3\lambda^2 - 3\lambda + \lambda + \lambda + \lambda - 1 - 1 - 1 + 1 + 1 + 1 \\
&= -\lambda^3 + 3\lambda^2 - 3\lambda + 3\lambda + 3 - 3 \\
&= -\lambda^3 + 3\lambda^2
\end{align}
\]
To finish, we multiply by -1 to get the canonical form of the characteristic polynomial because the characteristic polynomial is technically defined by \(\det(\lambda I - C )\) rather than \(\det(C - \lambda I)\)
\[
\det(\lambda I - C) = \lambda^3 - 3\lambda^2 \text{ when } C = \begin{bmatrix}1&1&1\\1&1&1\\1&1&1\end{bmatrix}
\]
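This result is easy to cross-check numerically; a short sketch with numpy (an assumed tool) recovers the same coefficients and the corresponding eigenvalues.

```python
# Check that the all-ones 3x3 matrix has characteristic polynomial
# lambda^3 - 3*lambda^2, i.e. eigenvalues {3, 0, 0}.
import numpy as np

C = np.ones((3, 3))
print(np.poly(C).round(10))            # [ 1. -3.  0.  0.] -> lambda^3 - 3*lambda^2
print(np.linalg.eigvals(C).round(10))  # 3, 0, 0 (in some order, up to rounding)
```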
What we have described so far is that (i) eigenvectors are the stable principal axes of linear transformations; (ii) the determinant is used to solve for eigenvalues (and by proxy eigenvectors) because it measures when a principal axis (i.e., eigenvector) has been collapsed to zero and canceled out; and (iii) the eigenvalue problem in Equation 2 can be equivalently represented by the characteristic polynomial simply by expanding the determinant of a matrix with an unknown variable.
The eigenspectrum and singular value decomposition
This section has so far discussed how to find an individual eigenvector-eigenvalue pair (‘spectral component’). But it is important to remember that the original transform \(C\) is made up of the set of all spectral components. This concept is important for understanding how spectral factorization can describe data and physical observations rather than abstract linear transforms. Specifically, the full set of spectral components allows us to factor any arbitrary matrix into separate components of information.
To show this factorization, we start with the transform \(C\) and right multiply by its collection of eigenvectors \(V\). This has the effect of scaling each eigenvector by its associated eigenvalue, contained in the matrix \(\Lambda\) with each \(\lambda_i\) on the diagonal (exactly because these are the vectors that do not change direction and are only scaled by the transform).
\[
CV = V\Lambda
\]
Because the eigenvectors of a symmetric matrix (such as the covariance and similarity matrices used throughout this document) form a set of linearly independent unit vectors (each vector has a length of \(1\) and points in a mutually orthogonal direction – at a right angle – to all other eigenvectors), we know that \(V\) has an inverse and that this inverse is its transpose: \(VV^{-1} = VV^{t} = V^{t}V = I\)
So
\[
C = V\Lambda V^t
\]
This equation tells us how to recreate our linear transform from its set of spectral components. More explicitly expanding this matrix multiplication, the transform \(C\) can be equivalently expressed as the sum
\[
C = V\Lambda V^t = \sum_i \lambda_i v_i v_i^t
\]
Here \(\lambda_i v_i\) is the \(i\)th scaled eigenvector and \(v_i^t\) is its transpose. The multiplication \(\lambda_i v_i v_i^t\) tells us how each original axis (usually a measured trait) in \(v_i^t\) gets transformed by – projected onto – this spectral component \(\lambda_i v_i\). It results in a matrix the same size as the original transform \(C\) and is called a rank-1 transform because it is composed of only a single vector. This representation emphasizes that each spectral component describes one layer of information about the original matrix \(C\). Taken together, the full set of spectral components completely recapitulates the original matrix.
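To make this layered, rank-1 picture concrete, the following sketch (assuming numpy, and using a small hypothetical symmetric matrix) rebuilds \(C\) one spectral component at a time.

```python
# Rank-1 reconstruction C = sum_i lambda_i * v_i * v_i^t for a symmetric matrix.
import numpy as np

C = np.array([[3.0, 1.0, 1.0],
              [1.0, 3.0, 3.0],
              [1.0, 3.0, 3.0]])

lam, V = np.linalg.eigh(C)             # columns of V are orthonormal eigenvectors

reconstruction = np.zeros_like(C)
for lam_i, v_i in zip(lam, V.T):
    reconstruction += lam_i * np.outer(v_i, v_i)   # add one rank-1 layer at a time

assert np.allclose(reconstruction, C)              # the layers recapitulate C
assert np.allclose(V @ V.T, np.eye(3))             # V^t is the inverse of V
```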
Let us now introduce the singular value decomposition (SVD). The SVD is an extension of eigen decomposition to non-square matrices. It allows the factorization of any matrix into three new matrices: \(U\), the left singular vectors; \(\Sigma\), a matrix with the singular values along the diagonal; and \(V^t\), the transposed right singular vectors.
\[
M = U\Sigma V^t
\tag{5}\]
To briefly see the relation to the eigen decomposition we can write
\[
\begin{align}
MM^t &= U\Sigma V^t V \Sigma U^t \\
MM^t &= U\Sigma \cancel{V^t V} \Sigma U^t \\
MM^t &= U\Sigma^2 U^t \\
\end{align}
\]
where \(MM^t\) is a matrix times its transpose – for scaled and centered matrices this is equivalent to the row-wise covariance matrix; \(U\) is both the left singular vectors of \(M\) and the eigenvectors of \(MM^t\); and \(\Sigma\) is a matrix with the singular values on the diagonal elements and \(\Sigma^2\) are the eigenvalues of \(MM^t\).
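The following sketch (assuming numpy, with a small hypothetical binary matrix) checks both identities: the squared singular values match the eigenvalues of \(MM^t\), and \(U\Sigma^2U^t\) reconstructs \(MM^t\).

```python
# Link between the SVD of M and the eigen decomposition of M M^t.
import numpy as np

M = np.array([[0, 0, 0, 1, 1, 1],
              [1, 1, 1, 0, 0, 0],
              [1, 1, 1, 0, 0, 0]], dtype=float)

U, s, Vt = np.linalg.svd(M)              # M = U Sigma V^t
lam, W = np.linalg.eigh(M @ M.T)         # eigenvalues of M M^t (ascending)

# Squared singular values equal the eigenvalues of M M^t.
assert np.allclose(np.sort(s**2), np.sort(lam))

# And U Sigma^2 U^t reconstructs M M^t.
assert np.allclose(U @ np.diag(s**2) @ U.T, M @ M.T)
```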
Covariance is the same as a matrix multiplication when a matrix is scaled and centered on the origin
Consider the matrix multiplication of \(X\), with \(i\) rows and \(n\) columns, and \(Y\), with \(n\) rows and \(j\) columns:
\[
XY = \begin{bmatrix}
\sum_n x_{1n}y_{n1} & \cdots & \sum_n x_{1n}y_{nj} \\
\vdots & \ddots & \vdots \\
\sum_n x_{in}y_{n1} & \cdots & \sum_n x_{in}y_{nj} \\
\end{bmatrix} \\
\]
For each element in the resulting matrix we can get to the covariance by scaling by the number of measurements \(\frac{1}{n}\) and subtracting each mean \(\bar{x}\) and \(\bar{y}\).
\[
X_iY_j = \sum_n(x_{in})(y_{nj})
\rightarrow \frac{1}{n}\sum_n(x_{in} - \bar{x})(y_{nj} - \bar{y})
=\text{cov}(X_i, Y_j)
\]
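A short numerical check of this equivalence (assuming numpy; note that `np.cov` divides by \(n-1\) unless told otherwise):

```python
# Centering each row and dividing by n reproduces the covariance matrix.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 10))                      # 4 variables measured 10 times

X_centered = X - X.mean(axis=1, keepdims=True)    # subtract each row's mean
n = X.shape[1]

manual_cov = (X_centered @ X_centered.T) / n      # (1/n) * sum of products
assert np.allclose(manual_cov, np.cov(X, bias=True))   # bias=True also divides by n
```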
How relatedness affects the eigenspectrum
Creating an ensemble of systems with varying degrees of relatedness
In this section we will use the concepts relating the eigenspectrum, the determinant, and the characteristic polynomial to show how the eigenspectrum changes as we increase the relatedness between systems. We purposefully use the term “system” because it is general: no property discussed here is specifically tied to a particular scale or field of physical science. However, to aid intuition we will pose the initial cases in terms of evolutionary history and genetics.
A system in this case represents a biological organism that can be described in terms of genetic traits. For simplicity we can simulate an organism’s genome with a vector of \(1\)’s and \(0\)’s, where a \(1\) represents the presence of a genetic trait and a \(0\) represents the absence. For example organism \(a\) can be represented as
\[
a = \begin{bmatrix}1&1&1&0&0&0\end{bmatrix}
\]
where organism \(a\) possesses the first three genetic traits and lacks the last 3.
Likewise, an ensemble of systems or a population of organisms can be represented as a binary matrix where each row represents an organism and each column represents a particular genetic trait. For example
\[
\begin{bmatrix}c\\b\\a\end{bmatrix} = \begin{bmatrix}1&1&1&0&0&0\\1&1&1&0&0&0\\1&1&1&0&0&0\end{bmatrix} = M_{similar}
\]
In this example organisms \(a\), \(b\), and \(c\) are all clones of each other: they share the presence of the first 3 genetic traits and the absence of the last 3. This is the case of an ensemble or population whose members are identical, or “similar”, to each other.
At the other extreme we could find an ensemble like this
\[
\begin{bmatrix}c\\b\\a\end{bmatrix} = \begin{bmatrix}0&0&0&1&1&1\\1&1&1&0&0&0\\1&1&1&0&0&0\end{bmatrix} = M_{modular}
\]
Here the full ensemble is split into two sub-populations with no shared genetic traits between \(c\) and \(\{a, b\}\). This is a ‘modular’ case in the sense that these two sub-populations are completely disconnected in terms of shared information. The descriptors of one sub-population are completely orthogonal to the other; the two populations tell us nothing about each other.
The most common case for biological populations is somewhere between these two extremes. We may find some organisms that are quite similar, along with other organisms that are more distantly related but still share some genetic traits
\[
\begin{bmatrix}c\\b\\a\end{bmatrix} = \begin{bmatrix}0&0&0&1&1&1\\1&1&0&0&0&1\\1&1&0&0&0&1\end{bmatrix} = M_{related}
\]
Our questions are: how does the eigenspectrum change as we adjust the degree of relatedness between systems in the ensemble, and where does the information about that relatedness fall in the eigenspectrum?
To calculate the eigenspectrum of each ensemble we first calculate the matrix \(MM^t\). The matrix \(MM^t\) is a square similarity matrix between each pair of organisms, or more generally between each pair of rows of \(M\). This is the standard first step toward calculating the SVD, and recall from §4.1.5 that it is how we can calculate the eigenspectrum of rectangular matrices.
We calculate the similarity matrices for each of the degrees of relatedness.
\[
\begin{align*}
M_{similar}M_{similar}^t &= \begin{bmatrix}3&3&3\\3&3&3\\3&3&3\\\end{bmatrix} \\
M_{related}M_{related}^t &= \begin{bmatrix}3&1&1\\1&3&3\\1&3&3\\\end{bmatrix} \\
M_{modular}M_{modular}^t &= \begin{bmatrix}3&0&0\\0&3&3\\0&3&3\\\end{bmatrix} \\
\end{align*}
\]
Here, each entry in the matrix is the number of shared genetic traits between a pair of organisms. Note how the only entries that change across the three cases are those between \(c\) (top row) and \(\{a, b\}\) (bottom and middle rows, respectively). Because of this isolation, we can easily parameterize this changing degree of relatedness as the variable gamma (\(\gamma\)), which we can smoothly vary to understand the changes in the eigenspectrum.
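As a quick numerical companion (assuming Python with numpy, which the text does not prescribe), the similarity matrices and their eigenvalues can be computed directly from the ensembles above; rows are ordered \([c, b, a]\) as in the text.

```python
# Similarity matrices M M^t and their eigenvalues for the three ensembles.
import numpy as np

M_similar = np.array([[1, 1, 1, 0, 0, 0],
                      [1, 1, 1, 0, 0, 0],
                      [1, 1, 1, 0, 0, 0]], dtype=float)
M_modular = np.array([[0, 0, 0, 1, 1, 1],
                      [1, 1, 1, 0, 0, 0],
                      [1, 1, 1, 0, 0, 0]], dtype=float)
M_related = np.array([[0, 0, 0, 1, 1, 1],
                      [1, 1, 0, 0, 0, 1],
                      [1, 1, 0, 0, 0, 1]], dtype=float)

for name, M in [("similar", M_similar), ("related", M_related), ("modular", M_modular)]:
    S = M @ M.T                                   # pairwise shared-trait counts
    print(name, S.astype(int).tolist(), np.linalg.eigvalsh(S).round(3))
```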
Specifically, we are first interested in the changes to this equation with respect to \(\gamma\)
\[
\det\left(\begin{bmatrix}
\langle c|c \rangle-\lambda&\gamma&\gamma\\
\gamma&\langle b|b \rangle-\lambda&s\\
\gamma&s&\langle a|a \rangle-\lambda\\
\end{bmatrix}\right) = 0
\tag{6}\]
where \(\langle c|c \rangle\) is the similarity of organism \(c\) with itself. In our cases \(c\) shares \(3\) genetic traits with itself, so \(\langle c|c \rangle = 3\). Likewise, \(\langle a|a\rangle = \langle b|b\rangle = s = 3\). We are using these variables to help track which pairs of organisms contribute to the terms of the determinant and the coefficients of the characteristic polynomial.
How do the eigenvalues change in response to a changing degree of relatedness?
We sought to understand how the magnitudes of the eigenvalues are dependent on the degree of relatedness between sub-populations of systems.
To start, we inserted our example’s variables into the first layer of the expanded determinant.
\[
\begin{align*}
(\langle c|c \rangle-\lambda)\left|\begin{array}{cc}(\langle b|b \rangle- \lambda)&s\\ s&(\langle a|a \rangle-\lambda)\end{array}\right| -
\gamma\left|\begin{array}{cc}\gamma&s\\\gamma&(\langle a|a \rangle-\lambda)\end{array}\right| +
\gamma\left|\begin{array}{cc}\gamma&(\langle b|b \rangle-\lambda)\\\gamma&s\end{array}\right|
\end{align*}
\]
Note how if \(\gamma=0\), all the terms that compare the similarity of \(c\) to \(\{a, b\}\) are zeroed out, and the only remaining term is the first one, which corresponds to the similarity of \(c\) with itself and, separately, the determinant of the \(\{a, b\}\) block. Also note how \(\gamma\) only appears in terms containing a single \(\lambda\). This means that the only term in the resulting characteristic polynomial that depends on \(\gamma\) is the first-order term in \(\lambda\).
Expanded and simplified we find that the characteristic polynomial including \(\gamma\) is
\[
\lambda^3 - 9\lambda^2 + (18 - 2\gamma^2)\lambda
\]
which is indeed only dependent on \(\gamma\) in the first order term of \(\lambda\).
Substituting in and varying values of \(\gamma\), i.e., varying the degree of relatedness gives us different instances of the characteristic polynomial.
- Modular (\(\gamma = 0\)): \(\lambda^3 - 9\lambda^2 + 18\lambda\)
- Related (\(\gamma = 1\)): \(\lambda^3 - 9\lambda^2 + 16\lambda\)
- Related (\(\gamma = 2\)): \(\lambda^3 - 9\lambda^2 + 10\lambda\)
- Similar (\(\gamma = 3\)): \(\lambda^3 - 9\lambda^2\)
We note a trend in the first order term of \(\lambda\); as the sub-populations become more related the first order coefficient in the polynomial gets closer and closer to zero. Once the sub-populations are identical the first order coefficient equals zero.
How does this changing coefficient change the roots (‘eigenvalues’) of the polynomial?
Plotting the determinant polynomial with respect to \(\lambda\), and more specifically the roots of the polynomials with respect to \(\gamma\), shows a pattern. If \(c\) is completely dissimilar to \(a\) and \(b\), the first-order coefficient, \(c_{n-2}\), is greater than zero and there are two necessarily non-zero eigenvalues: \(\lambda_1\) and \(\lambda_2\). Because \(a\) and \(b\) are completely similar in our example, \(\lambda_3\) is always necessarily zero.
As the similarity of \(c\) to \(a\) and \(b\) increases and \(c_{n-2}\) goes to zero, there are constraints on how the set of eigenvalues can change. Computing the eigenvalues as a function of the similarity of \(c\) to \(a\) and \(b\) shows that the two eigenvalues change in opposite directions: \(\lambda_1\) increases while \(\lambda_2\) decreases toward zero. This relationship between \(\lambda_1\) and \(\lambda_2\) is a natural mathematical consequence of solving for the roots of a polynomial with maximum degree three and minimum degree 1.
To see this, we start with the characteristic polynomial including the degree-of-relatedness variable \(\gamma\).
\[
\lambda^3 - 9\lambda^2 + (18 - 2\gamma^2)\lambda
\]
We can immediately factor out a root and power of \(\lambda\).
\[
(\lambda - 0)(\lambda^2 - 9\lambda^1 + (18 - 2\gamma^2))
\]
The factored-out root is the third eigenvalue of the problem, and it is always equal to zero because we have set up this example with \(a\) and \(b\) identical.
Factoring out this root allows us to use the quadratic formula to find the roots of the remaining second-order polynomial:
\[
\frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \rightarrow \frac{9 \pm \sqrt{81 - 4(18 - 2\gamma^2)}}{2} = \frac{9 \pm \sqrt{8\gamma^2 + 9}}{2}
\]
Here we see the remaining two roots expressed generically using the quadratic formula, giving the fully factored characteristic polynomial
\[
(\lambda - 0)\left(\lambda - \frac{9 - \sqrt{8\gamma^2 + 9}}{2}\right)\left(\lambda - \frac{9 + \sqrt{8\gamma^2 + 9}}{2}\right)
\]
Because \(\sqrt{8\gamma^2 + 9}\) will always be smaller than \(9\) while \(\gamma < \langle a|a\rangle = 3\), we can see that as \(\gamma\) approaches \(3\) the roots approach \(\{0, 0, +9\}\), and likewise, as \(\gamma\) approaches \(0\) the roots approach \(\{0, +3, +6\}\). These results make some intuitive sense when looking at Figure 3, which plots the determinant value with respect to \(\lambda\). When \(\gamma\) is large it drives the positive first-order term of the polynomial to zero, which means that the negative second-order term dominates until the positive third-order term finally overtakes it near the largest root. In contrast, when \(\gamma\) is small, the positive first-order term is present and can dominate the polynomial for small values of \(\lambda\) until the higher-order terms begin to dominate.
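For readers following along numerically, here is a minimal sketch (again assuming numpy) that cross-checks the closed-form roots above against a direct polynomial root-finder, using the quadratic factor of the characteristic polynomial (the third root, factored out earlier, is always zero).

```python
# Closed-form roots (9 +/- sqrt(8*gamma^2 + 9)) / 2 vs. numerical roots of
# lambda^2 - 9*lambda + (18 - 2*gamma^2).
import numpy as np

for gamma in [0.0, 1.0, 2.0, 3.0]:
    closed_form = np.sort([(9 - np.sqrt(8 * gamma**2 + 9)) / 2,
                           (9 + np.sqrt(8 * gamma**2 + 9)) / 2])
    numerical = np.sort(np.roots([1.0, -9.0, 18 - 2 * gamma**2]))
    assert np.allclose(closed_form, numerical)
    print(gamma, closed_form.round(3))
# gamma = 0 -> [3. 6.];  gamma = 3 -> [0. 9.];  the two non-zero roots move
# apart symmetrically as gamma increases.
```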
We also note two important facts:
First, the second eigenvalue
\[\left(\lambda - \frac{9 - \sqrt{8\gamma^2 + 9}}{2}\right)\]
will only equal zero when \(\gamma\) is exactly equal to \(3 = \langle a|a \rangle = \langle b|b \rangle = s\) – only when \(c\) is exactly identical to \(a\) and \(b\). At any point where there is any difference in similarity between these sub-populations there will be exactly \(2\) non-zero eigenvalues, and thus exactly \(2\) spectral components.
Second, the two eigenvalues (‘roots’) will change in equal and opposite directions as the degree of relatedness \(\gamma\) changes. These dynamics arise because the only occurrence of \(\gamma\) in the quadratic formula is behind the \(\pm\) sign, so as \(\gamma\) changes, the roots of the characteristic polynomial must change by the same amount in the positive and negative directions. We also note that the second eigenvalue drops off very quickly, following a power law on the order of \(\gamma^2\).
The results in this section support the idea that small eigenvalues – and by extension the spectral components they correspond to – are not noise. Up until the sub-populations are exactly equal there are necessarily small spectral components, and because the roots change on the order of \(\gamma^2\), even sub-populations that are not yet identical can have spectral components corresponding to eigenvalues with extremely small magnitude.
Change of eigenvector contributions in a 3x3 ensemble of systems as a function of similarity
To better understand the relevance of the information being described by these minor spectral components, we computed the contribution of organisms \(a\), \(b\), and \(c\) to eigenvectors \(v_1\) and \(v_2\), defined by \(\lambda_1\) and \(\lambda_2\) respectively.
For eigenvector \(v_1\), we found that the contribution of \(a\) and \(b\) is relatively constant while the contribution of \(c\) rapidly changes from zero and asymptotically approaches the same constant as \(a\) and \(b\). In contrast, for eigenvector \(v_2\) we found that the contribution of \(c\) is relatively constant while the contribution of \(a\) and \(b\) rapidly changes from zero to asymptotically reach a constant value away from that of \(c\). These results indicate that eigenvector \(v_1\) defines the similarity between \(a\), \(b\), and \(c\), whereas eigenvector \(v_2\) defines the difference between \(a\), \(b\), and \(c\). This relationship underlies using the eigenspectrum to define different scales of relatedness and is necessarily true independent of the percent variance harbored by each spectral component.
Yet, how is it that varying the similarity of \(c\) to \(a\) and \(b\) has different effects on how each system relates to the others, depending on which eigenvector is being considered?
We sought to understand why changing the similarity of \(c\) to \(a\) and \(b\) differentially affects the contribution of each system to eigenvectors \(v_1\) and \(v_2\). Solving the eigenvectors for the set of eigenvalues defines a system of equations relating the contribution of each system to eigenvectors \(v_1\) and \(v_2\) and the similarity of \(c\) to \(a\) and \(b\). The eigenvector equation \((C-\lambda I)\vec{v} = 0\) expanded into matrix notation is
\[
\begin{bmatrix}
\langle c|c \rangle-\lambda&\gamma&\gamma\\
\gamma&\langle b|b \rangle-\lambda&s\\
\gamma&s&\langle a|a \rangle-\lambda\\
\end{bmatrix}\begin{bmatrix}x_c\\x_b\\x_a\end{bmatrix}= \begin{bmatrix}0\\0\\0\end{bmatrix}
\tag{7}\]
We are interested in how the contributions \(\{x_a, x_b, x_c\}\) of each organism change with respect to \(\gamma\). This question necessitates calculating the partial derivative of the contribution of each system onto either eigenvector \(v_1\) or \(v_2\) with respect to the similarity of \(c\) to \(a\) and \(b\). We choose to calculate the derivative of each organism’s contribution starting at the modular example, where the sub-populations are completely independent, \(\gamma = 0\).
To calculate these derivatives it will be helpful to first calculate the contribution of each organism at the modular case and for comparison at the similar case so that we have concrete numbers that we can substitute into the different variables.
Modular case: eigenvector \(v_1\)
We can start with the modular case’s set of equations.
\[
\begin{bmatrix}
3-\lambda&0&0\\
0&3-\lambda&3\\
0&3&3-\lambda\\
\end{bmatrix}\begin{bmatrix}x_c\\x_b\\x_a\end{bmatrix}= \begin{bmatrix}0\\0\\0\end{bmatrix}
\]
And then substitute the first root in the modular case \(\lambda = 6\)
\[
\begin{bmatrix}
-3&0&0\\
0&-3&3\\
0&3&-3\\
\end{bmatrix}\begin{bmatrix}x_c\\x_b\\x_a\end{bmatrix}= \begin{bmatrix}0\\0\\0\end{bmatrix}
\]
To solve for \(x_c\) we look to the top row’s equation
\[
-3x_c + 0x_b + 0x_a = 0
\]
which is solved by setting \(x_c = 0\).
The equations for \(x_b\) and \(x_a\) work together to show that they can equal any real number so long as \(x_a = x_b\)
\[
3x_a - 3x_b = x_a - x_b = 0
\]
So, to make \([x_c,x_b,x_a]^t\) a unit vector (i.e., with length equal to \(1\)), we set \(x_a = x_b = \frac{1}{\sqrt{2}}\) and \(x_c = 0\).
Modular case: eigenvector \(v_2\)
We can use the same process to find the contribution of each organism onto the second eigenvector with \(\lambda = 3\)
\[
\begin{bmatrix}
0&0&0\\
0&0&3\\
0&3&0\\
\end{bmatrix}\begin{bmatrix}x_c\\x_b\\x_a\end{bmatrix}= \begin{bmatrix}0\\0\\0\end{bmatrix}
\]
To solve for \(x_c\) we look to the top equation
\[
0x_c + 0x_b + 0x_a = 0
\]
and see that \(x_c\) can be any value.
The equations for \(x_b\) and \(x_a\) work together to show that they must both equal zero.
\[
0x_c + 0x_b + 3x_a = x_a = 0
\]
\[
0x_c + 3x_b + 0x_a = x_b = 0
\]
So, to make \([x_c,x_b,x_a]^t\) a unit vector (i.e., with length equal to \(1\)), we set \(x_a = x_b = 0\) and \(x_c = 1\).
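Before moving on, the hand-derived modular eigenvectors can be confirmed with a numerical eigen solver (a sketch assuming numpy); rows and columns are ordered \([c, b, a]\) to match the text.

```python
# Modular-case eigenvectors for lambda = 6 and lambda = 3.
import numpy as np

S_modular = np.array([[3.0, 0.0, 0.0],
                      [0.0, 3.0, 3.0],
                      [0.0, 3.0, 3.0]])

lam, V = np.linalg.eigh(S_modular)   # eigenvalues in ascending order: [0, 3, 6]
v1 = V[:, 2]                         # pairs with lambda = 6
v2 = V[:, 1]                         # pairs with lambda = 3

print(lam.round(3))                  # [0. 3. 6.] (up to rounding)
print(np.abs(v1).round(3))           # [0.    0.707 0.707] -> x_c = 0, x_b = x_a = 1/sqrt(2)
print(np.abs(v2).round(3))           # [1. 0. 0.]          -> x_c = 1, x_b = x_a = 0
```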
Similar case: eigenvector \(v_1\)
We can also look at our similar case, substituting \(\lambda = 9\), and find that all contributions are uniformly distributed along the first eigenvector, with no other non-zero eigenvalues or eigenvectors.
\[
\begin{bmatrix}
-6&3&3\\
3&-6&3\\
3&3&-6\\
\end{bmatrix}\begin{bmatrix}x_c\\x_b\\x_a\end{bmatrix}= \begin{bmatrix}0\\0\\0\end{bmatrix}
\]
These are permutations of the same equation (after dividing each row by \(3\))
\[
x_a + x_b - 2x_c = 0
\]
\[
x_a + x_c - 2x_b = 0
\]
\[
x_c + x_b - 2x_a = 0
\]
and are solved with \(x_c = x_b = x_a\).
So, to make \([x_c,x_b,x_a]^t\) a unit vector, we set \(x_a = x_b = x_c = \frac{1}{\sqrt{3}}\).
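And the same numerical check for the similar case (again assuming numpy):

```python
# Similar case: a single non-zero eigenvalue (lambda = 9) whose eigenvector
# has uniform contributions of 1/sqrt(3).
import numpy as np

S_similar = np.full((3, 3), 3.0)       # every pair of organisms shares all 3 traits
lam, V = np.linalg.eigh(S_similar)

print(lam.round(3))                    # approximately [0. 0. 9.]
print(np.abs(V[:, -1]).round(3))       # [0.577 0.577 0.577] = 1/sqrt(3) for each organism
```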
Gathering the calculations
After these calculations, we have this table of results showing the contribution of each organism onto eigenvectors \(v_1\) and \(v_2\), both in the modular case and in the similar case:

| | Modular \(v_1\) | Modular \(v_2\) | Similar \(v_1\) | Similar \(v_2\) |
|---|---|---|---|---|
| \(x_c\) | \(0\) | \(1\) | \(\frac{1}{\sqrt{3}}\) | \(0\) |
| \(x_b\) | \(\frac{1}{\sqrt{2}}\) | \(0\) | \(\frac{1}{\sqrt{3}}\) | \(0\) |
| \(x_a\) | \(\frac{1}{\sqrt{2}}\) | \(0\) | \(\frac{1}{\sqrt{3}}\) | \(0\) |
These are the two extreme cases where the sub-populations are either completely distinct or completely identical. What we see is that in the distinct case the contributions of each sub-population are partitioned onto separate eigenvectors, such that the larger sub-population \(\{a, b\}\) falls on the eigenvector with the larger eigenvalue (\(v_1\)) with no contribution from the smaller sub-population \(c\). Conversely, the smaller sub-population’s contribution falls entirely on the eigenvector with the smaller non-zero eigenvalue (\(v_2\)) with no contribution from the larger sub-population. Once we have increased the degree of relatedness \(\gamma\) to the point where both sub-populations are identical, the contribution of each organism is uniformly weighted on the first eigenvector.
We can now get a sense of what the contributions mean and how different behavior arises on each eigenvector by exploring how the contributions change as we move away from the modular case by increasing the degree of relatedness \(\gamma\). Answering this question entails calculating the partial derivatives of \(\{x_a, x_b, x_c\}\) with respect to \(\gamma\). We will perform this calculation across both eigenvectors \(v_1\) and \(v_2\).
Partial derivative: \(\frac{\partial x_c}{\partial \gamma}\)
Let’s start with \(x_c\)
On both \(v_1\) and \(v_2\) we will need to solve this equation.
\[
(\langle c|c \rangle-\lambda)x_c + \gamma x_b + \gamma x_a = 0
\]
We can isolate \(x_c\)
\[
\begin{align}
(\langle c|c \rangle-\lambda)x_c + \gamma x_b + \gamma x_a &= 0 \\
(\langle c|c \rangle-\lambda)x_c &= -(\gamma x_b + \gamma x_a)\\
x_c &= \frac{-\gamma(x_b + x_a)}{(\langle c|c \rangle-\lambda)}\\
x_c &= \frac{\gamma(x_b + x_a)}{\lambda - \langle c|c \rangle}\\
\end{align}
\]
Because we are starting from the modular case:
- on \(v_1\) we know that \(x_a = x_b = \frac{1}{\sqrt{2}}\)
- on \(v_2\) we know that \(x_a = x_b = 0\)
We then get a different partial derivative for each eigenvector, simply from the fact that \(x_a\) and \(x_b\) zero out the derivative on eigenvector \(v_2\) and are non-zero on \(v_1\).
On \(v_1\) we can substitute and simplify
\[
\begin{align*}
x_c = \frac{\gamma(x_b + x_a)}{\lambda - \langle c|c \rangle} &= \frac{\gamma(\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}})}{\lambda - \langle c|c \rangle} \\
&= \frac{2\gamma}{\sqrt{2}\left(\lambda - \langle c|c \rangle\right)} \\
x_c &= \frac{\sqrt{2}\gamma}{\lambda - \langle c|c \rangle} \\
\end{align*}
\]
On \(v_2\) we can substitute and simplify
\[
\begin{align*}
x_c = \frac{\gamma(x_b + x_a)}{\lambda - \langle c|c \rangle} &= \frac{\gamma(0 + 0)}{\lambda - \langle c|c \rangle} \\
x_c &= 0
\end{align*}
\]
Partial derivatives: \(\frac{\partial x_b}{\partial \gamma}\), \(\frac{\partial x_a}{\partial \gamma}\)
In contrast, when we start with \(x_b\)’s equation
\[
\gamma x_c + (\langle b|b \rangle-\lambda)x_b + s x_a = 0
\]
And isolate \(x_b\)
\[
\begin{align}
\gamma x_c + (\langle b|b \rangle-\lambda)x_b + s x_a &= 0\\
x_b &= \frac{-(\gamma x_c + s x_a)}{(\langle b|b \rangle-\lambda)}\\
x_b &= \frac{(\gamma x_c + s x_a)}{\lambda - \langle b|b \rangle}\\
\end{align}
\]
We see that on eigenvector \(v_1\), \(x_c=0\) so the partial derivative is zeroed out. And on eigenvector \(v_2\), because \(x_c=1\) the derivative with respect to \(\gamma\) is
\[
\frac{\partial x_b}{\partial \gamma} = \frac{1}{\lambda - \langle b|b \rangle}
\]
Using the exact same procedure as for \(x_b\), we find that the partial derivatives for \(x_a\) are identical in form.
Pulling together all these partial derivatives, we start to see a pattern and an explanation for how the contributions of organisms can rise and fall on different eigenvectors:

| | \(x_c\) | \(x_b\) | \(x_a\) |
|---|---|---|---|
| \(\text{eigenvector } v_1\) | \(\frac{\partial x_c}{\partial \gamma} = \frac{\sqrt{2}}{\lambda_1 - \langle c \vert c \rangle}\) | \(\frac{\partial x_b}{\partial \gamma} = 0\) | \(\frac{\partial x_a}{\partial \gamma} = 0\) |
| \(\text{eigenvector } v_2\) | \(\frac{\partial x_c}{\partial \gamma} = 0\) | \(\frac{\partial x_b}{\partial \gamma} = \frac{1}{\lambda_2 - \langle b \vert b \rangle}\) | \(\frac{\partial x_a}{\partial \gamma} = \frac{1}{\lambda_2 - \langle a \vert a \rangle}\) |
We find that in the limit of a small increase in the similarity of \(c\) to \(a\) and \(b\), the change in the contribution of \(c\) to eigenvector \(v_1\) is a constant while that of \(a\) and \(b\) is zero. In contrast, the change in the contribution of \(c\) to eigenvector \(v_2\) is zero while that of \(a\) and \(b\) is a constant. Because \(\lambda_1\) and \(\lambda_2\) trend in opposite directions (and away from the singularity at \(\lambda=3\)), the contribution of \(c\) to eigenvector \(v_1\) smoothly tends towards that of \(a\) and \(b\), thereby defining relative system similarity, while the contributions of \(a\) and \(b\) to eigenvector \(v_2\) smoothly tend away from that of \(c\), thereby defining the relative dissimilarity between systems.
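As a rough numerical cross-check (assuming numpy), the qualitative pattern in the table above can be probed with a finite difference: perturb \(\gamma\) away from the modular case and observe which contributions move on \(v_1\) and which move on \(v_2\). The numerical rates depend on the unit-length normalization of the eigenvectors, so only the zero/non-zero pattern is meant to match the symbolic table.

```python
# Finite-difference probe of how the contributions [x_c, x_b, x_a] respond to gamma.
import numpy as np

def contributions(gamma):
    """Return the [x_c, x_b, x_a] contributions on v1 and v2 for a given gamma."""
    S = np.array([[3.0, gamma, gamma],
                  [gamma, 3.0, 3.0],
                  [gamma, 3.0, 3.0]])
    lam, V = np.linalg.eigh(S)
    order = np.argsort(lam)[::-1]          # largest eigenvalue first
    v1, v2 = V[:, order[0]], V[:, order[1]]
    # eigenvectors are only defined up to sign; pick the orientation with positive sum
    return v1 * np.sign(v1.sum()), v2 * np.sign(v2.sum())

eps = 1e-4
v1_0, v2_0 = contributions(0.0)
v1_e, v2_e = contributions(eps)

print(((v1_e - v1_0) / eps).round(3))   # only the c entry moves on v1
print(((v2_e - v2_0) / eps).round(3))   # only the b and a entries move on v2
```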