Aren’t screening code tests like this supposed to not be crowdsourced…? Good luck on your interview.
Lol
if fraud then reject else post
You’re not going to have a practical solution to this so I am assuming this is an interview question lol.
But to throw you a bone, your best bet would be to have a query pull out statistical measures from the dataset to identify points as outliers in the data and flag them for manual investigation. Things like the standard deviation of the population and the z-score of each transaction.
For example, I would want to see things like outliers in the distance of purchases from the user's home zip code, or in purchase price.
I would try to segment the data based on vendor type (such as restaurants, rent, entertainment, utilities, etc.) and identify outliers within the subgroups.
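To make the segment-then-z-score idea concrete, here is a rough sketch using Python's built-in sqlite3 so the SQL can actually be run. The `transactions` table, its columns, the toy rows, and the cutoff of 2 standard deviations are all made up for illustration. SQLite has no built-in STDDEV, so the query compares squared deviation against variance instead of computing the z-score directly.

```python
import sqlite3

# Hypothetical schema and toy rows, invented for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER, vendor_type TEXT, amount REAL)"
)
rows = [
    (1, "restaurant", 20.0), (2, "restaurant", 25.0), (3, "restaurant", 22.0),
    (4, "restaurant", 18.0), (5, "restaurant", 24.0), (6, "restaurant", 21.0),
    (7, "restaurant", 400.0),
    (8, "rent", 1500.0), (9, "rent", 1500.0), (10, "rent", 1520.0),
    (11, "rent", 1480.0),
]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)

Z_THRESHOLD = 2.0  # assumed cutoff; tune against real data

# Per-segment mean and population variance, then flag rows where
# (amount - mean)^2 > threshold^2 * variance, which is the same as |z| > threshold.
QUERY = """
WITH stats AS (
    SELECT vendor_type,
           AVG(amount) AS mean_amt,
           AVG(amount * amount) - AVG(amount) * AVG(amount) AS var_amt
    FROM transactions
    GROUP BY vendor_type
)
SELECT t.id
FROM transactions t
JOIN stats s ON s.vendor_type = t.vendor_type
WHERE s.var_amt > 0
  AND (t.amount - s.mean_amt) * (t.amount - s.mean_amt) > ? * ? * s.var_amt
ORDER BY t.id
"""
flagged_ids = [r[0] for r in conn.execute(QUERY, (Z_THRESHOLD, Z_THRESHOLD))]
print(flagged_ids)  # ids to send for manual investigation
```

The flagged rows are candidates for manual review, not confirmed fraud. In a dialect with STDDEV_POP and window functions you could compute the z-score per row directly.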
Essentially, I have to come up with "rules" to test if a transaction is potentially fraudulent, while making sure I don't reject legitimate transactions.
https://www.w3schools.com/sql/sql_case.asp
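For what it's worth, a rule set like that maps pretty directly onto a CASE expression. A minimal sketch, run through Python's sqlite3; the table, the `miles_from_home` column, and both cutoffs (5000 and 1000) are hypothetical placeholders, not recommended values:

```python
import sqlite3

# Hypothetical table and rows, invented for this sketch only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (id INTEGER, amount REAL, miles_from_home REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [(1, 40.0, 3.0), (2, 9000.0, 5.0), (3, 60.0, 2500.0)],
)

# Each WHEN is one "rule"; the first matching rule wins, everything else posts.
QUERY = """
SELECT id,
       CASE
           WHEN amount > 5000 THEN 'review'           -- assumed price rule
           WHEN miles_from_home > 1000 THEN 'review'  -- assumed distance rule
           ELSE 'post'
       END AS decision
FROM transactions
ORDER BY id
"""
decisions = dict(conn.execute(QUERY).fetchall())
print(decisions)
```

Routing to 'review' instead of rejecting outright is one way to avoid bouncing legitimate transactions while the rules are still being tuned.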
Seems like a weird question off the top, but I actually like it because it's essentially asking you to figure out a classification problem (with a specific threshold value) without using ML, thus testing your problem-solving and stats skills rather than your ability to use scikit-learn.
If you have a list of entries with fraud/not fraud flags and deposit amounts, I would just generate a big sequence of candidate cutoff amounts (you can do this in Presto and, I assume, other dialects of SQL).
Cartesian join that to the table and, for each cutoff, compute count(fraud)/count(transactions) over the table.
Basically it sounds like they want you to create a ROC curve, but I'm assuming you don't have to implement an actual ROC curve. Or maybe you do. Would have to see the data I guess 🤔
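A rough sketch of that sweep, again via Python's sqlite3 so it runs as-is. The `deposits` and `thresholds` tables, the labels, and the cutoff grid are all invented for illustration; in Presto the grid could come from a generated sequence instead of a helper table.

```python
import sqlite3

# Hypothetical labeled data: deposit amount plus a known fraud flag.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE deposits (amount REAL, is_fraud INTEGER)")
conn.executemany("INSERT INTO deposits VALUES (?, ?)", [
    (50.0, 0), (120.0, 0), (300.0, 0), (450.0, 1),
    (700.0, 0), (900.0, 1), (1500.0, 1), (2000.0, 1),
])

# Candidate cutoffs 0, 500, ..., 2000 (an assumed grid).
conn.execute("CREATE TABLE thresholds (cutoff REAL)")
conn.executemany(
    "INSERT INTO thresholds VALUES (?)",
    [(float(c),) for c in range(0, 2001, 500)],
)

# Cross join every cutoff with every deposit, then count per cutoff how many
# deposits would be flagged and how many of those are actual fraud or not.
QUERY = """
SELECT th.cutoff,
       SUM(CASE WHEN d.amount >= th.cutoff THEN 1 ELSE 0 END) AS flagged,
       SUM(CASE WHEN d.amount >= th.cutoff AND d.is_fraud = 1
                THEN 1 ELSE 0 END) AS true_pos,
       SUM(CASE WHEN d.amount >= th.cutoff AND d.is_fraud = 0
                THEN 1 ELSE 0 END) AS false_pos
FROM thresholds th
CROSS JOIN deposits d
GROUP BY th.cutoff
ORDER BY th.cutoff
"""
sweep = conn.execute(QUERY).fetchall()
for cutoff, flagged, true_pos, false_pos in sweep:
    print(cutoff, flagged, true_pos, false_pos)
```

Dividing true_pos and false_pos by the total fraud and non-fraud counts gives the two axes of a ROC curve, and you can pick the cutoff that trades them off the way the business wants.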
If they want you to create the fraud flags, I dunno. Kinda silly question without some ground truth. But to the point above, creating some sort of statistics around what a fraudulent transaction looks like and then implementing it is the way to go here.