Periodic kernel #61

andreaskoher · 2021-04-09T11:04:49Z

From the ToDo list #59 I have tried to go ahead with the periodic kernel as proposed by Solin & Särkkä (2014). The periodic kernel is approximated by a Taylor series and I imagine there are multiple ways to implement the idea. One is to implement the approximated periodic kernel directly as a SimpleKernel but, here I chose to build it up from a sum of elementary oscillators (I am absolutely not sure which approach is more efficient though). Also, I assumed that the kernel takes only a one dimensional input as done in the paper, but probably this can be extended. Finally, I also had to modify lgssm_components(k::KernelSum ...) in order to allow for more than two kernel components.

struct OscillatorKernel{T,I} <: KernelFunctions.SimpleKernel
    l
end
struct ApproxPeriodicKernel{T} <: KernelFunctions.Kernel
    kernels::KernelFunctions.KernelSum
end

ApproxPeriodicKernel( k::PeriodicKernel{T}; n=3 ) where T =
    ApproxPeriodicKernel{T}( sum( OscillatorKernel{T,i}( 1/(4*only( k.r )^2) ) for i in 1:n ) )
#the parametrisation in the paper is slightly different from KernelFunctions: l -> 2r
ApproxPeriodicKernel(; r=1.) = ApproxPeriodicKernel(PeriodicKernel(r=[r]))

function TemporalGPs.to_sde(k::OscillatorKernel{T,I}, s::SArrayStorage{T}) where {T<:Real,I}
    F = SMatrix{2, 2, T}(0, π*I, -π*I, 0)
    q = convert(T, 2*besseli(I, k.l)/exp(k.l))
    H = SVector{2, T}(1, 0)
    return F, q, H
end
function TemporalGPs.to_sde(k::OscillatorKernel{T,0}, s::SArrayStorage{T}) where {T<:Real}
    F = SMatrix{2, 2, T}(0, 0, 0, 0)
    q = convert(T, besseli(0,k.l)/exp(k.l))
    H = SVector{2, T}(1, 0)
    return F, q, H
end

function TemporalGPs.stationary_distribution(k::OscillatorKernel{T,I}, s::SArrayStorage{T}) where {T<:Real,I}
    q = 2*besseli(I,k.l)/exp(k.l)
    m = SVector{2, T}(0, 0)
    P = SMatrix{2, 2, T}( q, 0, 0, q )
    return TemporalGPs.Gaussian(m, P)
end
function TemporalGPs.stationary_distribution(k::OscillatorKernel{T,0}, s::SArrayStorage{T}) where {T<:Real}
    q = besseli(0,k.l)/exp(k.l)
    m = SVector{2, T}(0, 0)
    P = SMatrix{2, 2, T}( q, 0, 0, q )
    return TemporalGPs.Gaussian(m, P)
end

function TemporalGPs.lgssm_components(k::KernelSum, ts::AbstractVector, storage_type::TemporalGPs.StorageType)
    n = length(k.kernels)
    As_l, as_l, Qs_l, emission_proj_l, x0_l = TemporalGPs.lgssm_components(k.kernels[1], ts, storage_type)
    for i in 2:n
        As_r, as_r, Qs_r, emission_proj_r, x0_r = TemporalGPs.lgssm_components(k.kernels[i], ts, storage_type)
        As_l = map(TemporalGPs.blk_diag, As_l, As_r)
        as_l = map(vcat, as_l, as_r)
        Qs_l = map(TemporalGPs.blk_diag, Qs_l, Qs_r)
        emission_proj_l = TemporalGPs._sum_emission_projections(emission_proj_l, emission_proj_r)
        x0_l = TemporalGPs.Gaussian(vcat(x0_l.m, x0_r.m), TemporalGPs.blk_diag(x0_l.P, x0_r.P))
    end
    return As_l, as_l, Qs_l, emission_proj_l, x0_l
end

TemporalGPs.lgssm_components( k::ApproxPeriodicKernel{T}, t::AbstractVector, storage_type::TemporalGPs.StorageType{T},
) where {T<:Real} = TemporalGPs.lgssm_components( k.kernels, t, storage_type )

a simple usage example:

Random.seed!(1234)
f = GP( 10*transform(PeriodicKernel(r=[.5]), 1/50) )
x = RegularSpacing(0.0, 1., 365)
fx = f(x, .0001)
y = rand(fx)

f_naive_approx = GP( 10*transform(ApproxPeriodicKernel(r = .5), 1/50) )
f_approx = to_sde(f_naive_approx, SArrayStorage(Float64))
f_post_approx = posterior(f_approx(x, .0001), y)
y_post_approx = rand(f_post_approx(x, .0001))

plot(y)
plot!(y_post)

However, I ran into an issue with Zygote again, when calculating the gradient:

function objective(θ)
    f_naive = GP( 10*transform(ApproxPeriodicKernel(r = .5), 1/50) )
    f = to_sde(f_naive, SArrayStorage(Float64))
    return -logpdf(f(x, θ), y)
end
Zygote.gradient(objective, .1)

Need an adjoint for constructor SVector{2, Float64}. Gradient is of type Vector{Float64}

The text was updated successfully, but these errors were encountered:

andreaskoher · 2021-04-12T14:18:20Z

Alternatively to the above, the approach below is somewhat simpler as it directly constructs a SimpleKernel following Arno's implementation in GPStuff.

struct ApproxPeriodicKernel{J,T} <: KernelFunctions.SimpleKernel
    l::T # length scale as defined in Arno's text book 
end

# assuming length( k.r ) == 1
ApproxPeriodicKernel(k::PeriodicKernel{T}; degree_of_approx = 3) where {T} = ApproxPeriodicKernel{degree_of_approx,T}( 2*only(k.r) )
ApproxPeriodicKernel(; r=1/2, degree_of_approx = 3 ) = ApproxPeriodicKernel(PeriodicKernel(; r=[r]); degree_of_approx = degree_of_approx)
# k = ApproxPeriodicKernel()

function to_sde(k::ApproxPeriodicKernel{J,T}, s::SArrayStorage{T}) where {J,T<:Real}
    D  = 2*(J+1)
    w0 = 2π
    F  = kron( Diagonal(0:J), [0 -w0; w0 0])
    H  = kron( ones(1, J+1),[1 0])
    Fs = SMatrix{D,D,T}(F)
    q  = T(0)
    Hs = SVector{D,T}(H)#SMatrix{1,D,T}(H)
    return Fs, q, Hs
end

function stationary_distribution(k::ApproxPeriodicKernel{J,T}, s::SArrayStorage{T}) where {J,T<:Real}
    D = 2*(J+1)
    q = [ j>0 ? 2*besseli(j, k.l^-2)/exp(k.l^-2) : besseli(0, k.l^-2)/exp(k.l^-2) for j in 0:J]
    P = kron(Diagonal(q), Diagonal([1, 1]))
    m = SVector{D, T}( zeros(T,D) )
    Ps = SMatrix{D,D,T}(P)
    return Gaussian(m, Ps)
end

willtebbutt · 2021-04-12T17:35:08Z

Sorry for the slow response.

I think this second approach is probably better for now -- I'm not sure how useful the oscillator kernel is likely to be outside of the context of approximating this stuff.

willtebbutt · 2021-04-12T17:36:20Z

Regarding the Zygote + StaticArrays problems -- this is something that needs fixing.

I'll have a hack at a general solution this evening.

willtebbutt · 2021-04-12T20:21:45Z

Actually, looks like it'll have to be tomorrow unfortunately.

andreaskoher · 2021-04-12T21:46:36Z

no worries. I completely agree with the second approach

willtebbutt · 2021-04-20T18:42:23Z

@andreaskoher how's it going with this? Is there anything blocking it that I could assist with?

andreaskoher · 2021-04-22T15:54:15Z

thanks for the reminder :)
With the second definition of ApproxPeriodicKernel, I ran into the following problem with Zygote as mentioned in #65:

const x = RegularSpacing(0., 0.01, 1000)
const y = rand( to_sde( GP( ApproxPeriodicKernel() ) )(x) )
function objective(θ)
    k = θ * ApproxPeriodicKernel()
    f = to_sde( GP(k), SArrayStorage(Float64))
    -logpdf(f(x), y)
end
Zygote.gradient(objective, 1.)

no method matching iterate(::Nothing)

It references the list comprehension in stationary_distribution(k::ApproxPeriodicKernel{J,T}, s::SArrayStorage{T}):

q = [ j>0 ? 2*besseli(j, k.l^-2)/exp(k.l^-2) : besseli(0, k.l^-2)/exp(k.l^-2) for j in 0:J]

I can fix the problem by iteratively applying vcat in order to construct q as follows:

function TemporalGPs.stationary_distribution(k::ApproxPeriodicKernel{J,T}, s::SArrayStorage{T}) where {J,T<:Real}
    D = 2*(J+1)
    q = [besseli(0, k.l^-2)/exp(k.l^-2)]
    for j in 1:J
        q  = vcat(q, [2*besseli(j, 1/k.l^-2)/exp(k.l^-2)])
    end
    # q = [ j>0 ? 2*besseli(j, k.l^-2)/exp(k.l^-2) : besseli(0, k.l^-2)/exp(k.l^-2) for j in 0:J]
    P = kron(Diagonal(q), Diagonal([1, 1]))
    m = SVector{D, T}( zeros(T,D) )
    Ps = SMatrix{D,D,T}(P)
    return TemporalGPs.Gaussian(m, Ps)
end

It looks somewhat inefficient to me but, I am also no expert on good coding style. In any case, with this fix, I can correctly infer the parameters:

# Declare model parameters using `ParameterHandling.jl` types.
Random.seed!(1234)

params = (
    s = positive(10.),
    l = positive(1/20),
    v  = positive(.01),
)

function build_gp(ps)
    k = ps.s* ApproxPeriodicKernel() ∘ ScaleTransform(ps.l)
    return to_sde( GP(k), SArrayStorage(Float64))
end

# Generate some synthetic data from the prior.
x = RegularSpacing(0.0, 0.01, 10_000)
y = rand(build_gp(value(params))(x, value(params.v)))

# Construct mapping between structured and Vector representation of parameters.
flat_initial_params, unflatten = flatten(params)

# Specify an objective function for Optim to minimise in terms of x and y.
function objective(flat_params)
    params = value(unflatten(flat_params))
    f = build_gp(params)
    return -logpdf(f(x, params.v), y)
end

# Optimise using Optim.
training_results = Optim.optimize(
    objective,
    θ -> only(Zygote.gradient(objective, θ)),
    flat_initial_params+rand(4), # Add some noise to make learning non-trivial
    Optim.LBFGS(
        alphaguess = Optim.LineSearches.InitialStatic(scaled=true),
        linesearch = Optim.LineSearches.BackTracking(),
    ),
    Optim.Options(show_trace = true);
    inplace=false,
)

# Extracting the final values of the parameters.
final_params = value(unflatten(training_results.minimizer))

# compare data to inference
f = build_gp(value(final_params))
fx = posterior(f(x, final_params.v),y)

plot(x,y)
plot!(x, fx(x, final_params.v))

The problem remains that I can still not infer the length scale r of the periodic kernel. Maybe the problem is related to the Zygote issue with besseli that you opened:

gradient(l->besseli(0, l^-2), .1)

not implemented

As a better workaround than using vcat to construct the vector q as above, we could define a function

getq(l,J) = [ j>0 ? 2*besseli(j, k.l^-2)/exp(k.l^-2) : besseli(0, k.l^-2)/exp(k.l^-2) for j in 0:J]

and implement the analytically derived adjoints as in Arnos implementation in GPStuff. What do you think?

willtebbutt · 2021-04-27T19:29:48Z

The problem remains that I can still not infer the length scale r of the periodic kernel. Maybe the problem is related to the Zygote issue with besseli that you opened:

Ahh okay -- this sounds like the kind of problem that I'd expect. We should be getting a fix for this soon.

and implement the analytically derived adjoints as in Arnos implementation in GPStuff. What do you think?

I think this would be an excellent idea for now! Are you familiar with doing this using ChainRules? Should just require an @scalar_rule if I'm not mistaken.

edit: also, sorry for the slow response time!

andreaskoher · 2021-04-28T15:27:39Z

no rush :)

I added the following analytical derivatives:

using ChainRulesCore
q0(l) = besseli(0, l^-2)/exp(l^-2)
δq0_δl(l) = l^-3/exp(l^-2)*(
         2*besseli(0,l^-2) -
         2*besseli(1,l^-2)
)
@scalar_rule(q0(x), δq0_δl(x))

qj(l,j) = 2*exp(-l^-2)*besseli(j,1/l^2)
δqj_δl(l,j) = 
        4besseli(j, 1/l^2) / exp( l^-2 ) / l^3 - 
        2(
            besseli(j-1, 1/l^2) + 
            besseli(j+1, 1/l^2)
        ) / exp( l^-2 ) / l^3
          
@scalar_rule(qj(x,j), (δqj_δl(x,j), DoesNotExist()))

and checked the gradients against ForwardDiff and FiniteDifferences. At this level everything seems fine, however, the following gradient with respect to the length scale r still fails:

const x = RegularSpacing(0., 0.01, 1000)
const y = rand( to_sde( GP( ApproxPeriodicKernel() ) )(x) )
function objective(θ)
    k = ApproxPeriodicKernel( r=θ )
    f = to_sde( GP(k), SArrayStorage(Float64))
    -logpdf(f(x), y)
end
Zygote.gradient(objective, 1.)

no method matching iterate(::Nothing)

As before, it references the same list comprehension in

function TemporalGPs.stationary_distribution(k::ApproxPeriodicKernel{J,T}, s::SArrayStorage{T}) where {J,T<:Real}
    D = 2*(J+1)
    q = [ j>0 ? qj(k.l, j) : q0(k.l) for j in 0:J]
    P = kron(Diagonal(q), Diagonal([1, 1]))
    m = SVector{D, T}( zeros(T,D) )
    Ps = SMatrix{D,D,T}(P)
    return TemporalGPs.Gaussian(m, Ps)
end

I will try and dig into the issue these days

anhi · 2022-06-15T08:21:48Z

Hey there,

I was wondering if there is any progress on this issue (and #64). I tried using the above approach with current versions of all involved libraries (changing DoesNotExist() to NoTangent()), but am running into

ERROR: MethodError: no method matching size(::TemporalGPs.LGSSM{TemporalGPs.GaussMarkovModel{TemporalGPs.Forward, FillArrays.Fill{StaticArrays.SMatrix{8, 8, Float64, 64}, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{FillArrays.Zeros{Float64, 1, Tuple{Base.OneTo{Int64}}}, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{StaticArrays.SMatrix{8, 8, Float64, 64}, 1, Tuple{Base.OneTo{Int64}}}, TemporalGPs.Gaussian{StaticArrays.SVector{8, Float64}, StaticArrays.SMatrix{8, 8, Float64, 64}}}, StructArrays.StructVector{TemporalGPs.ScalarOutputLGC{LinearAlgebra.Adjoint{Float64, StaticArrays.SVector{8, Float64}}, Float64, Float64}, NamedTuple{(:A, :a, :Q), Tuple{FillArrays.Fill{LinearAlgebra.Adjoint{Float64, StaticArrays.SVector{8, Float64}}, 1, Tuple{Base.OneTo{Int64}}}, FillArrays.Fill{Float64, 1, Tuple{Base.OneTo{Int64}}}, Vector{Float64}}}, Int64}})```

willtebbutt · 2022-06-15T10:43:42Z

Hiya. Unfortunately I've not had time to work on this in the recent past, and I'm not entirely sure when I'm going to have the time in the near future.

theogf · 2023-03-15T14:39:00Z

I am planning to work on this issue once #90 is finished

This was referenced Apr 13, 2021

Add generic SArray adjoint #65

Merged

besseli (and friends?) FluxML/Zygote.jl#945

Closed

theogf self-assigned this Mar 15, 2023

theogf mentioned this issue Mar 21, 2023

Add ApproximatePeriodicKernel #98

Merged

1 task

theogf closed this as completed May 9, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Periodic kernel #61

Periodic kernel #61

andreaskoher commented Apr 9, 2021

andreaskoher commented Apr 12, 2021 •

edited

Loading

willtebbutt commented Apr 12, 2021

willtebbutt commented Apr 12, 2021

willtebbutt commented Apr 12, 2021

andreaskoher commented Apr 12, 2021

willtebbutt commented Apr 20, 2021

andreaskoher commented Apr 22, 2021

willtebbutt commented Apr 27, 2021 •

edited

Loading

andreaskoher commented Apr 28, 2021

anhi commented Jun 15, 2022

willtebbutt commented Jun 15, 2022

theogf commented Mar 15, 2023

Periodic kernel #61

Periodic kernel #61

Comments

andreaskoher commented Apr 9, 2021

andreaskoher commented Apr 12, 2021 • edited Loading

willtebbutt commented Apr 12, 2021

willtebbutt commented Apr 12, 2021

willtebbutt commented Apr 12, 2021

andreaskoher commented Apr 12, 2021

willtebbutt commented Apr 20, 2021

andreaskoher commented Apr 22, 2021

willtebbutt commented Apr 27, 2021 • edited Loading

andreaskoher commented Apr 28, 2021

anhi commented Jun 15, 2022

willtebbutt commented Jun 15, 2022

theogf commented Mar 15, 2023

andreaskoher commented Apr 12, 2021 •

edited

Loading

willtebbutt commented Apr 27, 2021 •

edited

Loading