Thought this was interesting. Across 160 teams of researchers, just about all failed to make good life outcome predictions on things like GPA, evictions, layoffs, and others. Data followed 4.5k families across 15 years, with 13k features (varied over time). Haven't looked at it directly yet, but will be turning the docs and data inside out... In the meantime, authors claim this as showing the limits of ML. Oh, and it's published in PNAS, so you know there's some big publication energy there.
https://www.pnas.org/content/117/15/8398





Statistics & Probability
1. Explain the bias–variance tradeoff.
2. What assumptions underlie linear regression? What happens if they’re violated?
3. How do you interpret a p-value? What are common misconceptions?
4. Describe Type I vs. Type II error. How would you reduce each?
5. When would you use a Bayesian approach over a frequentist one?
6. Explain confidence intervals vs. credible intervals.
7. How would you design and analyze an A/B test?
8. What are the risks of peeking at experiment results early?
9. How do you handle multiple hypothesis testing?
10. Explain Simpson’s Paradox with an example.
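For question 10, a small worked example can help. The sketch below uses made-up counts (not from any real study) in which arm A has the higher success rate inside each segment, yet arm B looks better once the segments are pooled, because the two arms are spread very unevenly across segments. The names `COUNTS`, `segment_rates`, and `pooled_rates` are just illustrative.

```python
# Hypothetical (successes, trials) counts, invented purely for illustration.
COUNTS = {
    "segment_x": {"A": (90, 100), "B": (800, 1000)},
    "segment_y": {"A": (300, 1000), "B": (20, 100)},
}


def segment_rates(counts):
    """Success rate of each arm within each segment."""
    return {
        segment: {arm: s / n for arm, (s, n) in arms.items()}
        for segment, arms in counts.items()
    }


def pooled_rates(counts):
    """Success rate of each arm after summing counts across segments."""
    totals = {}
    for arms in counts.values():
        for arm, (s, n) in arms.items():
            prev_s, prev_n = totals.get(arm, (0, 0))
            totals[arm] = (prev_s + s, prev_n + n)
    return {arm: s / n for arm, (s, n) in totals.items()}


if __name__ == "__main__":
    for segment, rates in segment_rates(COUNTS).items():
        print(segment, rates)
    print("pooled", pooled_rates(COUNTS))
```

Within each segment A beats B, but most of A's trials sit in the harder segment, so the pooled comparison flips. In an interview answer, the follow-up point is usually which comparison matches the decision being made.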
⸻
Machine Learning
11. Compare random forests and gradient boosting.
12. How does regularization work (L1 vs L2)?
13. When would you use logistic regression over a tree-based model?
14. Explain how cross-validation works and when it might fail.
15. What causes overfitting? How do you detect and prevent it?
16. How would you handle imbalanced data?
17. Explain feature importance methods (e.g., SHAP, permutation importance).
18. Describe how you would build a recommendation system.
19. How do you evaluate model performance when business costs are asymmetric?
20. What is concept drift, and how would you monitor for it in production?
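For question 19, one common approach (not the only one) is to pick the classification threshold that minimizes total misclassification cost on a validation set, rather than defaulting to 0.5. A minimal sketch, assuming binary labels, model scores in [0, 1], and fixed per-error costs; the scores, labels, and cost values below are invented for illustration.

```python
def total_cost(scores, labels, threshold, cost_fp, cost_fn):
    """Sum the cost of false positives and false negatives at a threshold."""
    cost = 0.0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 0:
            cost += cost_fp
        elif not predicted_positive and label == 1:
            cost += cost_fn
    return cost


def best_threshold(scores, labels, cost_fp, cost_fn):
    """Return the candidate threshold with the lowest total cost.

    Candidates are the observed scores plus +inf (predict nothing positive).
    """
    candidates = sorted(set(scores)) + [float("inf")]
    return min(
        candidates,
        key=lambda t: total_cost(scores, labels, t, cost_fp, cost_fn),
    )


# Hypothetical validation scores and labels, for illustration only.
SCORES = [0.1, 0.3, 0.4, 0.6, 0.8, 0.9]
LABELS = [0, 0, 1, 0, 1, 1]

if __name__ == "__main__":
    print(best_threshold(SCORES, LABELS, cost_fp=1, cost_fn=10))
    print(best_threshold(SCORES, LABELS, cost_fp=10, cost_fn=1))
```

When misses are expensive, the chosen threshold drops so that more cases are flagged. When false alarms are expensive, it rises.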
⸻
Product & Business Sense
21. How would you measure success for a new feature in a social media product?
22. If user engagement drops 10% week-over-week, how would you investigate?
23. How would you decide whether to ship a model to production?
24. How do you translate a vague business problem into a data science project?
25. What metrics would you use to evaluate a search ranking algorithm?
26. How would you quantify the impact of latency on user retention?
27. Design an experiment to test a pricing change.
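For question 27, part of a typical answer is sizing the experiment before launch. The sketch below uses the standard normal-approximation formula for comparing two proportions, and it assumes the primary metric is a conversion rate. A pricing test may care more about revenue per visitor, which would need a different variance estimate. The function name and default values are illustrative.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate users needed per arm for a two-sided two-proportion test.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * (p1(1-p1) + p2(1-p2)) / (p1-p2)^2.
    """
    if p_control == p_treatment:
        raise ValueError("the two rates must differ to size the test")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    effect = p_treatment - p_control
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


if __name__ == "__main__":
    # Hypothetical baseline of 10% conversion and a hoped-for 12%.
    print(sample_size_per_arm(0.10, 0.12))
```

Smaller expected effects or higher power targets push the required sample size up quickly, which is often the deciding factor in how long the test has to run.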
⸻
SQL & Data Engineering
28. Write a query to compute 7-day rolling active users.
29. How would you detect data leakage in a dataset?
30. Describe how you would design a data pipeline for training and serving models.
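For question 28, here is one possible answer sketched with Python's built-in sqlite3 so it can be run end to end. It assumes a hypothetical `events(user_id, day)` table with `day` stored as `YYYY-MM-DD` text. The table, column names, and sample rows are invented, and the date arithmetic uses SQLite's `date()` modifiers, so other databases would need their own syntax.

```python
import sqlite3

# For each day that has activity, count distinct users seen in the
# 7-day window ending on that day (the day itself plus the 6 days before).
ROLLING_QUERY = """
SELECT d.day, COUNT(DISTINCT e.user_id) AS rolling_7d_users
FROM (SELECT DISTINCT day FROM events) AS d
JOIN events AS e
  ON e.day BETWEEN date(d.day, '-6 days') AND d.day
GROUP BY d.day
ORDER BY d.day
"""


def rolling_active_users(rows):
    """Load (user_id, day) rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE events (user_id TEXT, day TEXT)")
        conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
        return conn.execute(ROLLING_QUERY).fetchall()
    finally:
        conn.close()


# Hypothetical activity log, for illustration only.
SAMPLE_ROWS = [
    ("u1", "2024-01-01"),
    ("u2", "2024-01-01"),
    ("u1", "2024-01-03"),
    ("u3", "2024-01-08"),
    ("u2", "2024-01-09"),
]

if __name__ == "__main__":
    for day, users in rolling_active_users(SAMPLE_ROWS):
        print(day, users)
```

Note that this version only returns days that have at least one event. Joining against a calendar table would be one way to also report days with no activity. `COUNT(DISTINCT ...)` is used because a user active on several days in the window should be counted once.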
⸻
Behavioral & Communication
31. Tell me about a time your model failed in production.
32. Describe a situation where stakeholders disagreed with your analysis.
33. How do you explain complex results to non-technical audiences?
34. Tell me about a time you had to make a decision with incomplete data.
I totally understand technical interviews for data science roles can be tricky, and having a solid question bank can make a huge difference. A few strategies:
Leverage online resources: LeetCode, Kaggle, and Glassdoor often have real interview questions for data science roles.
Review fundamentals: Statistics, probability, SQL, Python/R, machine learning concepts, and data visualization are commonly tested.
Mock interviews: Practicing with peers or platforms like Pramp or Interviewing.io helps simulate real interview pressure.
Project-based prep: Be ready to discuss your portfolio projects in depth; recruiters often test applied knowledge.
If anyone wants guidance on preparing strategically for data science interviews or building a question prep plan, feel free to reach out to remi.executiverecuiter.lhh.gmail.com. He recently helped my brother land the role he wanted.