Running a GPU 24/7 for a model that gets 10 requests per hour is like keeping a helicopter running on the helipad all day because you might need it. Land it (stop the instance), and call it when you need it.