I love R but it’s definitely not widespread.
Yeah, I'd say Python is a much better language than R, but the tidyverse beats anything Python has to offer.
Pick the tool for the job. Scraping data or interacting with web APIs? I reach for Python. But cleaning/summarizing messy data and then visualizing it? Tidyverse time.
Or SAS. I really like cleaning data in SAS, and PROC TABULATE is great for making quick tables that are in a ready-to-use format.
I use R on several projects. I prefer it over Python for any sort of data wrangling or statistical modeling. Plus RStudio is hands down the best data science IDE in my opinion.
R is a disease spread by academics who venture into industry
You’re not wrong! Lol
If you want to do serious stats including a ton of *supervised* ML, R has the much better codebase.
If you're just running a shitton of logits and calling it "AI", Python is fine. But realize you are overselling a mathematical construct that is 50 years old, and that the value of your analytics is in the data quality, not the mathematical function or your code.
What are the Python equivalents of
- mice for missing data
- grf / causal forests
- parsnip + broader tidymodels (in particular, glmnet if I want to run penalized regression models like lasso)
- tmle
- simstudy
- stan for Bayesian
- lme4 for hierarchical
- forecast for time series including hypothesis testing and diagnostics
... and how many lines of code would I need for the Python equivalent of each one, plus the viz/dashboard I'd otherwise build with ggplot2/Shiny?
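On the glmnet item in the list above: the lasso fit itself is a small algorithm, and a rough idea of the line count can be had from a pure-Python sketch of cyclic coordinate descent with soft-thresholding, the general approach glmnet is built around. The names `soft_threshold` and `lasso_cd` and the toy data are made up for illustration, no intercept is fitted, and this is nowhere near a replacement for a tuned library.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    X is a list of rows, y a list of targets. No intercept is fitted and the
    columns are used as given (assumption: the caller has centered them).
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    # Mean squared value of each column, the denominator of every update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual, i.e. the
            # residual computed without column j's current contribution.
            rho = 0.0
            for i in range(n):
                pred_others = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_others)
            rho /= n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta


# Toy data (hypothetical): y depends only on the first column, y = 2 * x1.
X = [[-2.0, 1.0], [-1.0, -1.0], [0.0, 0.5], [1.0, -1.0], [2.0, 0.5]]
y = [2.0 * row[0] for row in X]

beta_ols = lasso_cd(X, y, lam=0.0)     # no penalty: plain least squares
beta_lasso = lasso_cd(X, y, lam=0.5)   # moderate penalty: shrunken first coefficient
beta_null = lasso_cd(X, y, lam=100.0)  # huge penalty: every coefficient is zero
```

In practice most Python users would likely reach for scikit-learn's `Lasso` class, which minimizes the same objective; the sketch only shows that the core technique is not R-specific.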
There is a reason most bioscience PhDs and research groups use R rather than Python, although of course there is just some cultural stickiness too
Have used it in industry. It's more likely to show up when the work is purely analytic, when you're unlikely to automate things, and when the environment is more “academic” with publishing potential.
Also, you can’t compare Tableau to R lol. They have different primary purposes.
I use R every day. I think either can be used for everyday work. They're comparable in ML functionality, with Python having stronger deep learning frameworks and R having stronger traditional statistical modeling packages.
Like S1 said - it’s used more for modeling or where there is stats or analysis involved.
My econ consulting firm uses R almost exclusively.
drop the firm name!
Most common language among our team!
Yup R user here. Pretty consistent. Also love ggplot and plotly
Always just use python
In my limited experience running R in the cloud, I have an easier time writing PySpark and scaling it than I do parallelizing in R. I haven't done enough R to say that I was doing it optimally; however, with lambda functions in PySpark and/or Spark SQL there is so much more that I can do quickly. Moreover, more coworkers know PySpark, so my code is more reusable and easier to pipeline, parameterize, and version control than when I write something in RStudio.
The other part is that when you are crunching too much data to sit on a desktop and you really have to transform your code (i.e. terabytes' worth of data that you are going to have to join), RStudio is not what I reach for.
For anyone who is a Databricks nerd... D15s v2 x 20 nodes with a D32s v3 driver is a great setup for crunching boodles of data... comparing that to parallelized RStudio is like comparing apples and oranges.
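The lambda-driven style mentioned here can be sketched in plain Python, without a cluster. The records, the field layout, and the `reduce_by_key` helper are made up for illustration; in PySpark the same lambdas would typically be passed to RDD methods such as `map`, `filter`, and `reduceByKey`, which this sketch only imitates locally.

```python
# Hypothetical (region, units) records standing in for a much larger dataset.
records = [("east", 10), ("west", 4), ("east", 7), ("north", 0), ("west", 6)]

# Keep only rows with sales, then normalize the key: the filter and map steps.
non_empty = filter(lambda rec: rec[1] > 0, records)
keyed = map(lambda rec: (rec[0].upper(), rec[1]), non_empty)


def reduce_by_key(pairs, combine):
    """Local stand-in for a reduce-by-key: fold together values sharing a key."""
    acc = {}
    for key, value in pairs:
        acc[key] = combine(acc[key], value) if key in acc else value
    return acc


totals = reduce_by_key(keyed, lambda a, b: a + b)
print(totals)
```

The appeal on a cluster is that each lambda is a small, stateless step, so the engine can run it on partitions of the data independently.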
@k1 it's just cluster size e-peen =]
Should probably size the cluster to the workload vs random suggestions.
Also RStudio doesn't parallelize. It's an IDE. PS1 is clueless.
I use R constantly. I hate python.
I use R because that's what I learned and most of what I do is statistical analysis. I have been meaning to learn Python but haven't gotten to it, nor needed it.
Also my staff knows python.
Python is widely used in my current department, alongside SQL and Tableau. I personally love R because it was my first programming language if you don’t count SQL (a sub-programmatic language). R is similar to Excel and SQL, so it’s quite easy to understand.
Regardless of the Python vs R argument, both of them are simply tools to get the job done.
My firm uses it for supply chain forecasting work
Built an ML platform for our DS team in R. AMA.
We use R, SAS, and Stata for very different purposes. R is great for mapping
Also the true hardcore data scientists (of which I am not one) who are inventing the shit that the best tech companies are using and which we might pick up in 10 years' time are not in either Python or R.
They're using Julia.
I don't think this is correct =]
I wish we would move to Julia. I ❤️ Julia, but to my knowledge it isn't used by the ranking teams (maybe someone knows better) and we are heavily invested in PyTorch/Prophet.
I was under the impression Google was using Go or Swift for the next version of TensorFlow(?).
The R/Python debate is silly. For most of what people use them for, the high-level languages are just interfaces to Fortran/C/C++. If the language has the tools and suits your workflow, it shouldn't matter.
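A quick way to see the "just an interface" point from the Python side is to ask the interpreter which functions are compiled and which are written in Python. This is a minimal sketch that assumes CPython, the reference interpreter; the `backend` helper is a made-up name, and other interpreters may report differently.

```python
import inspect
import math
import statistics


def backend(func):
    """Classify a callable as compiled code or a pure-Python function."""
    if inspect.isbuiltin(func):
        return "compiled"
    if inspect.isfunction(func):
        return "python"
    return "other"


report = {
    "math.sqrt": backend(math.sqrt),
    "sum": backend(sum),
    "sorted": backend(sorted),
    "statistics.mean": backend(statistics.mean),
}
for name, kind in report.items():
    print(f"{name}: {kind}")
```

On CPython the first three come back as compiled, so calling them from Python is mostly a thin wrapper around C, which is the same arrangement R has with its own compiled routines.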