vi_solver
SolverHistory
Class to represent the solving history of a solver. Its purpose is to allow plotting of the solution and of the evolution of the value function over the training process. This class is not meant to be instantiated manually; it is meant to be used when returned by the solve() method of a Solver object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
tracking_level | int | The tracking level of the solver. | required |
model | Model | The model that has been solved by the Solver. | required |
gamma | float | The gamma parameter used by the solver (the discount factor). | required |
eps | float | The epsilon parameter used by the solver (convergence bound). | required |
initial_value_function | ValueFunction | The initial value function the solver will use to start the solving process. | None |
Attributes:
Name | Type | Description |
---|---|---|
tracking_level | int | |
model | Model | |
gamma | float | |
eps | float | |
run_ts | datetime | The time at which the SolverHistory object was instantiated, which is assumed to be the start of the solving run. |
iteration_times | list[float] | A list of recorded iteration times. |
value_function_changes | list[float] | A list of recorded value function changes (the maximum change in value between two value functions). |
value_functions | list[ValueFunction] | A list of recorded value functions. |
solution | ValueFunction | |
summary | str | |
Source code in olfactory_navigation/agents/model_based_util/vi_solver.py
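The `value_function_changes` attribute is described as the maximum change in value between two value functions. As a minimal sketch, assuming a value function can be flattened to a plain list of per-state values (the library's `ValueFunction` type is richer, and `max_change` is a hypothetical helper that is not part of `vi_solver`), that quantity could be computed like this:

```python
def max_change(previous, current):
    """Largest absolute per-state difference between two value tables.

    Hypothetical helper for illustration only; both arguments are plain
    sequences of floats of equal length.
    """
    if len(previous) != len(current):
        raise ValueError("value tables must have the same length")
    return max(abs(c - p) for p, c in zip(previous, current))


v_prev = [0.0, 1.0, 2.0]
v_next = [0.5, 1.0, 1.25]
change = max_change(v_prev, v_next)  # per-state differences: 0.5, 0.0, 0.75
```

A single number per iteration is cheap to store, which is presumably why it can be tracked even when the value functions themselves are not kept.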
solution
property
The last value function of the solving process.
summary
property
A summary as a string of the information recorded.
add(iteration_time, value_function_change, value_function)
Function to add a step in the simulation history.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
iteration_time | float | The time it took to run the iteration. | required |
value_function_change | float | The change between the value function of this iteration and of the previous iteration. | required |
value_function | ValueFunction | The value function resulting after a step of the solving process. | required |
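To make the role of `add()` concrete, the sketch below uses a small stand-in class that mirrors the documented signature and attribute names. `HistorySketch` is hypothetical and is not the library's `SolverHistory`; in particular, how each tracking level gates what gets stored is an assumption based on the levels described for `solve()`.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class HistorySketch:
    """Hypothetical stand-in for SolverHistory (illustration only)."""
    tracking_level: int
    run_ts: datetime = field(default_factory=datetime.now)
    iteration_times: list = field(default_factory=list)
    value_function_changes: list = field(default_factory=list)
    value_functions: list = field(default_factory=list)

    def add(self, iteration_time, value_function_change, value_function):
        # Assumed behaviour: level 0 records nothing, level 1 records
        # timings and changes, level 2 also keeps the value functions.
        if self.tracking_level >= 1:
            self.iteration_times.append(float(iteration_time))
            self.value_function_changes.append(float(value_function_change))
        if self.tracking_level >= 2:
            self.value_functions.append(value_function)

    @property
    def solution(self):
        # Last recorded value function, or None if none were kept.
        return self.value_functions[-1] if self.value_functions else None


# One call per solver iteration; value functions are plain lists here.
history = HistorySketch(tracking_level=2)
history.add(0.01, 0.75, [0.5, 1.0, 1.25])
history.add(0.01, 0.10, [0.6, 1.0, 1.30])
```

In the stand-in, the recorded lists grow by one entry per call, so plotting `value_function_changes` against the iteration index gives the convergence curve.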
plot_changes()
Function to plot the value function changes over the solving process.
solve(model, horizon=100, initial_value_function=None, gamma=0.99, eps=1e-06, use_gpu=False, history_tracking_level=1, print_progress=True)
Function to solve an MDP model using Value Iteration. If an initial value function is not provided, the value function is initialized with the expected rewards.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Model | The model on which to run value iteration. | required |
horizon | int | How many iterations to run the value iteration solver for. | 100 |
initial_value_function | ValueFunction | An optional initial value function to kick-start the value iteration process. | None |
gamma | float | The discount factor, which weights immediate rewards more heavily than long-term rewards. The learning rate is 1/gamma. | 0.99 |
eps | float | The smallest allowed change for the value function. Below this amount of change, the value function is considered converged and the value iteration process ends early. | 1e-6 |
use_gpu | bool | Whether to use the GPU with cupy arrays to accelerate solving. | False |
history_tracking_level | int | How thorough the tracking of the solving process should be (0: nothing; 1: times and sizes of belief sets and value function; 2: the actual value functions and belief sets). | 1 |
print_progress | bool | Whether or not to print out the progress of the value iteration process. | True |
Returns:
Name | Type | Description |
---|---|---|
value_function | ValueFunction | The resulting value function solution to the model. |
history | SolverHistory | The tracking of the solution over time. |
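The interplay between `horizon`, `gamma` and `eps` can be illustrated with a small tabular value iteration loop. This is a sketch under stated assumptions, not the library's implementation: `value_iteration_sketch`, the nested-list layout of `transitions` and `rewards`, and the choice to start from the best immediate reward per state (a simplification of the documented "expected rewards" initialization) are all hypothetical.

```python
import time


def value_iteration_sketch(transitions, rewards, horizon=100, gamma=0.99, eps=1e-6):
    """Tabular value iteration on a toy MDP (illustrative only).

    transitions[s][a] is a list of (probability, next_state) pairs and
    rewards[s][a] is the expected immediate reward; both layouts are
    assumptions made for this sketch.
    """
    n_states = len(transitions)
    # Assumed initialization: best expected immediate reward per state.
    values = [max(rewards[s]) for s in range(n_states)]
    changes, times = [], []

    for _ in range(horizon):
        start = time.perf_counter()
        # Bellman backup: best action value under the current estimate.
        new_values = [
            max(
                rewards[s][a]
                + gamma * sum(p * values[s2] for p, s2 in transitions[s][a])
                for a in range(len(transitions[s]))
            )
            for s in range(n_states)
        ]
        change = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        times.append(time.perf_counter() - start)
        changes.append(change)
        if change < eps:  # converged: stop before the horizon is reached
            break

    return values, changes, times


# Two states, two actions. State 1 is absorbing and pays 1 per step;
# from state 0, action 0 stays put and action 1 moves to state 1.
toy_transitions = [
    [[(1.0, 0)], [(1.0, 1)]],
    [[(1.0, 1)], [(1.0, 1)]],
]
toy_rewards = [[0.0, 0.0], [1.0, 1.0]]

values, changes, times = value_iteration_sketch(
    toy_transitions, toy_rewards, horizon=100, gamma=0.5, eps=1e-6
)
```

With `gamma=0.5` the change halves at every iteration, so the loop stops on the `eps` test well before the horizon. With a discount close to 1 the change shrinks much more slowly, and the `horizon` limit is more likely to be what ends the run.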