So if you know Scala and/or Spark, PySpark will be comically easy to learn. And you should learn it since you’ll have to work with data scientists, who almost exclusively code in Python/PySpark.
The learning curve wouldn’t be huge, especially if you know what you are trying to code. There is enough documentation out there that gives you the equivalent code in Spark SQL, PySpark, and Scala.
Scala is a huge asset to know. PySpark is great for working with other people. PySpark engineering will pay X. Scala engineering will pay X+20-50% because there are fewer people that know it. I got raked over the coals finding a Scala engineer to support a legacy process until it could be converted over. So, in an ideal world, learn PySpark and learn to convert your code from Scala to PySpark - and you'll make bank.
As a heads-up, most folks come to PySpark from Python, from either the software engineering end or the desktop ML end. There are major differences between cloud ETL and those other topics.
Everything Director 1 said is correct.
Yeah, as S1 said, you should learn it, and it should not be that difficult. Python is also used for stuff like ML and lambda functions, so it’s quite useful and will help you a ton going forward.
Try Spotify
How did you get into that position? I am looking to change or go into Data Engineering.
As for me, by accident. One day I was told to fix a stored procedure doing ETL work, then had to move that logic to SSIS (picked it up on the fly), then was asked to do a pilot on Spark to see performance improvements.
At the end, somehow people called me a data engineering expert. I was a web developer.
I wouldn’t worry too much about learning PySpark before applying, since knowledge of any Spark API should be enough to get in the door. There’s something like 10 languages that can interact with Spark, so the core of it is that you understand Spark.
+1 on PySpark, and adding to that, a lot of the Scala advantages don’t matter in the Databricks notebook environment, for example. Notebooks don’t support the features of IDEs or production-grade code packages, so if the intent is to work strictly with notebooks, don’t expect too many benefits from Scala’s advantages.