Bowl Leader
I have a strong opinion about auto ML in general. It's good for iterating on models very similar to ones you've already built. I'm talking "swap one variable with no consequences for the process" similar. For everything else, it's shaky.
The problem is that you often discover important things about your data only while modeling. Standard data cleanup usually doesn't go deep enough to find weird artifacts of the data generating process. Auto ML, on the other hand, is a model selection and tuning tool. If you feed it data that's off, it'll still run, but the validation metrics and the decisions it makes will range from shaky to very wrong. This is why we run pilots with all models, actually. However, slapping auto ML on everything reduces the time put into contextual data cleanup. This is a recipe for subpar or failed results in pilots.
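To make the "it'll still run, but the metrics will be off" point concrete, here is a minimal sketch. It is not any particular auto ML tool. It assumes a toy 1-nearest-neighbor classifier, pure-noise data, and a hypothetical artifact where an upstream join duplicates every row. All names in it are made up for illustration.

```python
import random

def one_nn_predict(train, x):
    # Label of the closest training row (squared Euclidean distance).
    nearest = min(train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)))
    return nearest[1]

def holdout_accuracy(rows, seed=0, test_fraction=0.3):
    # Shuffle, split into train/test, and score 1-NN on the held-out part.
    rng = random.Random(seed)
    rows = rows[:]
    rng.shuffle(rows)
    n_test = int(len(rows) * test_fraction)
    test, train = rows[:n_test], rows[n_test:]
    hits = sum(one_nn_predict(train, x) == y for x, y in test)
    return hits / len(test)

# Pure noise: the labels are random, so there is nothing real to learn.
rng = random.Random(42)
clean = [([rng.random() for _ in range(5)], rng.randint(0, 1)) for _ in range(300)]

# Hypothetical artifact: every record appears twice (e.g. a bad join upstream).
duplicated = clean * 2

clean_acc = holdout_accuracy(clean)
dup_acc = holdout_accuracy(duplicated)
print(f"holdout accuracy, clean rows:      {clean_acc:.2f}")
print(f"holdout accuracy, duplicated rows: {dup_acc:.2f}")
```

On the clean rows the score should hover around chance. On the duplicated rows it looks far better, only because most held-out rows have an identical twin in the training split. An automated tuner would happily optimize against that second number.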
There are situations where that might not matter. It's an exploration vs production thing. If you're in production, by all means, let the machine handle it.* Ten of twenty similar models working ok can be better than three of four working slightly better. If on the other hand you're exploring new problems, you really want that first model to uncover all it can. It'll up the performance of your later process, and maximize the chance of the project not being falsely deemed a failure early on.
*This doesn't mean put your production system in freefall. High failure cost projects should still have controls.
Thank you for taking the time to give this thorough input 🙏 I 100% agree with all of this.
Coach
I used the trial version when it was first released. The big bonus is automated feature engineering.
I'm going to get flamed for this, but feature engineering is boring and formulaic. So is model fitting. Partially automating those steps makes a lot of sense to me.
You can then spend most of your time gathering and cleaning data or understanding the problem. As long as driverless is a good fit for your use case you can then let it take over feature engineering and model fitting.
However, when I evaluated it, it was rarely a good fit for my use cases.