Solve Interface

CommonSolve.solve — Function
solve(problem::AbstractIntervalMDPProblem, alg::RobustValueIteration; callback=nothing)

Solve a verification or control synthesis problem for an interval Markov process using robust value iteration, minimizing or maximizing the optimistic or pessimistic satisfaction value according to the specification.

A callback function may be provided; it is called at each iteration with the current value function and the iteration count, and should have the signature callback(V::AbstractArray, k::Int).

solve can be called without specifying the algorithm, in which case it defaults to RobustValueIteration.

Examples

using IntervalMDP

prob1 = IntervalAmbiguitySets(;
    lower = [
        0.0 0.5
        0.1 0.3
        0.2 0.1
    ],
    upper = [
        0.5 0.7
        0.6 0.5
        0.7 0.3
    ],
)

prob2 = IntervalAmbiguitySets(;
    lower = [
        0.1 0.2
        0.2 0.3
        0.3 0.4
    ],
    upper = [
        0.6 0.6
        0.5 0.5
        0.4 0.4
    ],
)

prob3 = IntervalAmbiguitySets(;
    lower = [
        0.0 0.0
        0.0 0.0
        1.0 1.0
    ],
    upper = [
        0.0 0.0
        0.0 0.0
        1.0 1.0
    ]
)

transition_probs = [prob1, prob2, prob3]
initial_state = [1]
mdp = IntervalMarkovDecisionProcess(transition_probs, initial_state)

# output

FactoredRobustMarkovDecisionProcess
├─ 1 state variables with cardinality: (3,)
├─ 1 action variables with cardinality: (2,)
├─ Initial states: [1]
├─ Transition marginals:
│  └─ Marginal 1:
│     ├─ Conditional variables: states = (1,), actions = (1,)
│     └─ Ambiguity set type: Interval (dense, Matrix{Float64})
└─Inferred properties
   ├─Model type: Interval MDP
   ├─Number of states: 3
   ├─Number of actions: 2
   ├─Default model checking algorithm: Robust Value Iteration
   └─Default Bellman operator algorithm: O-Maximization

reach_states = [3]
time_horizon = 10
prop = FiniteTimeReachability(reach_states, time_horizon)
spec = Specification(prop, Pessimistic, Maximize)

# output

Specification
├─ Satisfaction mode: Pessimistic
├─ Strategy mode: Maximize
└─ Property: FiniteTimeReachability
   ├─ Time horizon: 10
   └─ Reach states: CartesianIndex{1}[CartesianIndex(3,)]

# Verification
problem = VerificationProblem(mdp, spec)
sol = solve(problem, RobustValueIteration(default_bellman_algorithm(mdp)); callback = (V, k) -> println("Iteration ", k))
V, k, res = sol  # or `value_function(sol), num_iterations(sol), residual(sol)`

# output

Iteration 1
Iteration 2
Iteration 3
Iteration 4
Iteration 5
Iteration 6
Iteration 7
Iteration 8
Iteration 9
Iteration 10
IntervalMDP.VerificationSolution{Float64, Vector{Float64}, Nothing}([0.9597716063999999, 0.9710050144, 1.0], [0.01593864639999998, 0.011487926399999848, -0.0], 10, nothing)

# Control synthesis
problem = ControlSynthesisProblem(mdp, spec)
sol = solve(problem, RobustValueIteration(default_bellman_algorithm(mdp)); callback = (V, k) -> println("Iteration ", k))
σ, V, k, res = sol # or `strategy(sol), value_function(sol), num_iterations(sol), residual(sol)`

# output

Iteration 1
Iteration 2
Iteration 3
Iteration 4
Iteration 5
Iteration 6
Iteration 7
Iteration 8
Iteration 9
Iteration 10
IntervalMDP.ControlSynthesisSolution{TimeVaryingStrategy{1, Vector{Tuple{Int32}}}, Float64, Vector{Float64}, Nothing}(TimeVaryingStrategy{1, Vector{Tuple{Int32}}}(Vector{Tuple{Int32}}[[(1,), (2,), (1,)], [(1,), (2,), (1,)], [(1,), (2,), (1,)], [(1,), (2,), (1,)], [(1,), (2,), (1,)], [(1,), (2,), (1,)], [(1,), (2,), (1,)], [(1,), (2,), (1,)], [(1,), (2,), (1,)], [(1,), (2,), (1,)]]), [0.9597716063999999, 0.9710050144, 1.0], [0.01593864639999998, 0.011487926399999848, -0.0], 10, nothing)
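As noted above, the algorithm argument can be omitted, in which case RobustValueIteration is used by default. A minimal sketch, reusing the `problem` constructed in the example:

```julia
# Equivalent to the explicit call above; when no algorithm is given,
# solve falls back to the default, RobustValueIteration.
sol = solve(problem)
V, k, res = sol  # value function, iteration count, residual
```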
IntervalMDP.residual — Function
residual(s::VerificationSolution)

Return the residual of a verification solution.

residual(s::ControlSynthesisSolution)

Return the residual of a control synthesis solution.

IntervalMDP.num_iterations — Function
num_iterations(s::VerificationSolution)

Return the number of iterations of a verification solution.

num_iterations(s::ControlSynthesisSolution)

Return the number of iterations of a control synthesis solution.

IntervalMDP.value_function — Function
value_function(s::VerificationSolution)

Return the value function of a verification solution.

value_function(s::ControlSynthesisSolution)

Return the value function of a control synthesis solution.

IntervalMDP.TimeVaryingStrategy — Type
TimeVaryingStrategy

A time-varying strategy may select a different action at each time step. Because one strategy entry must be stored per time step, the strategy is finite and therefore applies only to finite-time specifications whose horizon equals the strategy's length.

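For intuition, the control synthesis output above stores one action tuple per state for each of the 10 time steps. A plain-Julia sketch of that shape (an illustration only, not the package's internal representation):

```julia
horizon = 10                           # matches the FiniteTimeReachability horizon above
actions_per_step = [(1,), (2,), (1,)]  # one action tuple per state, as printed above

# A time-varying strategy conceptually holds one such assignment per time step:
strategy_entries = [copy(actions_per_step) for _ in 1:horizon]

length(strategy_entries) == horizon  # true
strategy_entries[1][2] == (2,)       # action chosen in state 2 at step 1: true
```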

VI-like Algorithms

IntervalMDP.RobustValueIteration — Type
RobustValueIteration

A robust value iteration algorithm for solving interval Markov decision processes (IMDPs) with interval ambiguity sets. This algorithm is designed to handle both finite and infinite time specifications, optimizing for either the maximum or minimum expected value based on the given specification.
