The actual action frames get higher weights 🏋️♂️, and neighboring frames get decaying weights.
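The exact decay scheme isn't spelled out above; here is a minimal sketch of one way to do it, assuming an exponential decay around each action frame (the peak, decay rate, and window size are placeholder values, not the ones actually used):

```python
# Sketch: assign a weight to every frame in an episode, with the action
# frame at full weight and neighbors decaying exponentially.
# peak/decay/window are assumptions for illustration.
def frame_weights(num_frames, action_frames, peak=1.0, decay=0.5, window=4):
    weights = [0.0] * num_frames
    for a in action_frames:
        for offset in range(-window, window + 1):
            i = a + offset
            if 0 <= i < num_frames:
                w = peak * (decay ** abs(offset))
                weights[i] = max(weights[i], w)  # keep the strongest nearby action
    return weights

w = frame_weights(10, action_frames=[3])
# w[3] == 1.0, w[2] == w[4] == 0.5, w[1] == w[5] == 0.25
```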
-
My model's performance is actually really bad. My guess is that I did not collect enough games/episodes for training.
-
Each episode contains around 900 frames, and not all of them make it into the dataset; inclusion depends on the assigned weights.
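One way to read "inclusion depends on the weights" is a hard cutoff: keep only frames whose weight clears a threshold. This is a guess at the rule, not a confirmed detail:

```python
def select_frames(weights, threshold=0.2):
    # Keep the indices of frames whose weight clears the threshold;
    # the cutoff value here is an assumption for illustration.
    return [i for i, w in enumerate(weights) if w >= threshold]

kept = select_frames([0.0, 0.25, 0.5, 1.0, 0.5, 0.1])
# kept == [1, 2, 3, 4]
```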
-
If you would like to reproduce the model for your own card selection, you will have to re-train the policy network from scratch, since the card information will be different; I have not fine-tuned it yet.
-
As you may notice, this model is only valid for one specific card set, not for all the cards. Training a policy network covering every card could be interesting, but the dataset size it would require would be very, very expensive.
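To get a feel for why an all-cards dataset blows up, we can just count distinct decks; the card-pool size and deck size below are placeholder numbers, not the game's actual values:

```python
from math import comb

cards_in_pool = 100  # hypothetical total number of cards
deck_size = 8        # hypothetical deck size

# Number of distinct decks, ignoring placement and game state entirely:
print(comb(cards_in_pool, deck_size))  # 186087894300
```

Each of those decks would need its own coverage of situations and placements, so the real state space is far larger still.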
-
Treat the training dataset as a way to represent the knowledge domain, and this domain is crazily high-dimensional. Given that, we would have to provide the model with data covering which card I select and where I place it for every possible situation. The dataset would have to be dense enough to make the knowledge domain smooth enough for the model to learn. (Emmm, this is just my personal, half-baked thinking.)
-
This is offline learning: I cannot access live interaction with the game's internals, so traditional RL cannot be applied. If the game internals could be accessed, we would not need any visual detection to extract information, and an RL algorithm could improve the model by interacting with the game directly.
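The offline setup described here amounts to behavior cloning: supervised learning on recorded (frame, action) pairs with no environment interaction. A toy sketch with numpy follows; the linear softmax model and random features are stand-ins for the real vision pipeline and policy network:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: frame features -> which card was played (4 classes).
X = rng.normal(size=(64, 16))    # stand-in for features extracted from frames
y = rng.integers(0, 4, size=64)  # stand-in for the recorded card choices

W = np.zeros((16, 4))            # linear policy: softmax over 4 cards
for _ in range(200):
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    probs[np.arange(len(y)), y] -= 1.0               # d(cross-entropy)/d(logits)
    W -= 0.1 * (X.T @ probs) / len(y)                # gradient step

acc = (np.argmax(X @ W, axis=1) == y).mean()
print(acc)  # training accuracy; should beat chance (0.25) on this toy data
```

With game internals exposed, the same policy could instead be trained with a standard RL loop (collect rollouts, score them with the game's reward, update), skipping the visual-detection stage entirely.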