Component | Description or value |
---|---|
Actor network | |
    Layers | Fully connected PI layer |
    Input | Observations |
    Output | Control actions |
Critic network | |
    State path layers | Fully connected layer with 32 neurons |
    Action path layers | Fully connected layer with 32 neurons |
    Input | Observations and control action |
    Output | Q-values |
Hyperparameters | |
    Actor learning rate | 1e−3 |
    Critic learning rate | 1e−3 |
    Gradient threshold | 1 |
    Mini-batch size | 128 |
    Experience buffer length | 1e6 |
    Exploration model | Gaussian noise with a variance of 0.1 |
DC motor parameters | |
    L | 0.05 |
    R | 1 |
    \(K_m\) | 0.05 |
    \(K_v\) | 0.05 |
    J | 1e−5 |
    b | 1e−2 |
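
To make the DC motor parameters in the table concrete, the following is a minimal sketch of how they could enter a plant model. It assumes the standard two-state armature-controlled DC motor equations, \(L\,\dot{i} = v - R\,i - K_v\,\omega\) and \(J\,\dot{\omega} = K_m\,i - b\,\omega\); the table does not state the model, the physical meaning of each symbol, or the units, so those readings are assumptions. The names `DCMotorParams`, `motor_derivatives`, `steady_state_speed`, and `simulate_speed` are hypothetical and introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DCMotorParams:
    """Values copied from the table; units are not stated there."""
    L: float = 0.05    # assumed meaning: armature inductance
    R: float = 1.0     # assumed meaning: armature resistance
    K_m: float = 0.05  # assumed meaning: torque constant
    K_v: float = 0.05  # assumed meaning: back-EMF constant
    J: float = 1e-5    # assumed meaning: rotor inertia
    b: float = 1e-2    # assumed meaning: viscous friction coefficient


def motor_derivatives(p, current, speed, voltage):
    """Time derivatives of current and speed under the assumed model."""
    d_current = (voltage - p.R * current - p.K_v * speed) / p.L
    d_speed = (p.K_m * current - p.b * speed) / p.J
    return d_current, d_speed


def steady_state_speed(p, voltage):
    """Speed at which both derivatives vanish for a constant voltage."""
    return voltage * p.K_m / (p.R * p.b + p.K_m * p.K_v)


def simulate_speed(p, voltage, dt=1e-4, t_end=1.0):
    """Forward-Euler integration from rest; returns the final speed."""
    current, speed = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        d_current, d_speed = motor_derivatives(p, current, speed, voltage)
        current += dt * d_current
        speed += dt * d_speed
    return speed


if __name__ == "__main__":
    params = DCMotorParams()
    print(steady_state_speed(params, 1.0))
    print(simulate_speed(params, 1.0))
```

Forward Euler is used only to keep the sketch dependency-free. With these values the assumed model has a fast mechanical mode, so the step size must stay well below about 2 ms for the integration to remain stable; a stiff ODE solver would be a more robust choice in practice.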