I have a problem I've been sitting on for a while. I deployed a Flask app using Web Apps; it includes a website and a prediction API. When the website is first accessed, the files necessary for prediction are loaded into memory. I have the 'Always on' setting enabled, but as I understand it, it only pings the root of the site, and the API itself is at .../predict, so the API seems unaffected.
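For context, the structure is roughly as in the simplified sketch below. This is not the actual code: `load_artifacts`, `get_artifacts`, and the toy weights are hypothetical placeholders for the real prediction files, and having `/predict` also go through `get_artifacts` is an assumption made so the sketch runs on its own.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Prediction files, loaded once per worker process and then kept in memory.
_artifacts = None


def load_artifacts():
    """Hypothetical stand-in for reading the real prediction files from disk."""
    return {"weights": [0.5, 1.5]}


def get_artifacts():
    """Load the files on first use, then reuse the in-memory copy."""
    global _artifacts
    if _artifacts is None:
        _artifacts = load_artifacts()
    return _artifacts


@app.route("/")
def index():
    # The first visit to the website triggers the load.
    get_artifacts()
    return "website"


@app.route("/predict", methods=["POST"])
def predict():
    # The API reuses whatever is already in memory.
    weights = get_artifacts()["weights"]
    features = request.get_json()["features"]
    score = sum(w * x for w, x in zip(weights, features))
    return jsonify(score=score)
```

With this layout, 'Always on' hitting `/` exercises `index` but never the code path inside `predict`.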

If the API is idle for a while, the first call afterwards takes a long time to respond, and subsequent calls are fast again. After some investigation, the slowdown seems to occur in the parts that use the files I loaded into memory earlier. This leads me to believe those files are being offloaded somewhere, and that bringing them back takes much longer than loading them from scratch would. (I do not want to load them on every request, as that would slow down every response considerably.)
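The kind of per-stage timing used to narrow this down can be sketched as follows. It is a minimal illustration using `time.perf_counter`; the `timed` helper, the stage names, and the dummy computation are assumptions, not the real request handler.

```python
import time
from contextlib import contextmanager

# Elapsed seconds per stage for the most recent request.
timings = {}


@contextmanager
def timed(stage):
    """Record how long the wrapped block takes under the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


def handle_request(features, weights):
    """Hypothetical handler split into stages so each can be timed."""
    with timed("parse"):
        values = [float(x) for x in features]
    with timed("use_loaded_files"):
        # This is the stage that touches the data loaded into memory earlier.
        score = sum(w * x for w, x in zip(weights, values))
    return score
```

Logging `timings` for the first call after an idle period and for a later call shows which stage accounts for the extra latency.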


Can someone confirm my suspicion or point me to some resources that would explain why this is happening?



