



They want to know which hyperscaler (AWS, Azure, etc.) and which associated technologies to use for the different components, such as ingestion, processing/pipeline, and consumption, as well as other tools like Snowflake, Databricks, etc.
You need more information. If you don’t have it and cannot get access to it, you would say: “Well, the best choice depends on your needs. Let’s pretend you don’t have any existing tech stack for your cloud-based data platform and work from there. Here are the considerations I would need to account for:”
1. Cloud Infrastructure: Platforms like Amazon Web Services (AWS), Microsoft Azure, or Google Cloud provide the underlying infrastructure to host and manage the platform.
2. Storage: Cloud-based storage solutions such as Amazon S3, Azure Blob Storage, or Google Cloud Storage are used to store large volumes of data.
3. Data Processing: Technologies like Apache Spark or Apache Hadoop enable distributed processing of data for tasks like data transformation, analysis, and machine learning.
4. Data Warehousing: Data warehousing solutions like Amazon Redshift, Google BigQuery, or Snowflake are used to organize and analyze structured data efficiently.
5. Data Integration: Tools like Apache Kafka or AWS Glue help in data ingestion, streaming, and integration from various sources into the platform.
6. Database Management: Databases such as Amazon RDS, Azure SQL Database, or Google Cloud SQL provide the ability to manage structured data efficiently.
7. Data Visualization: Platforms like Tableau, Power BI, or Looker enable the creation of interactive dashboards and visualizations for data analysis and reporting.
8. Security and Identity Management: Services like AWS IAM, Azure Active Directory (now Microsoft Entra ID), or Google Cloud Identity and Access Management ensure secure access and data protection.
9. Monitoring and Logging: Tools like AWS CloudWatch, Azure Monitor, or Google Cloud Monitoring provide insights into system performance, resource utilization, and error tracking.
10. APIs and Integration: RESTful APIs or GraphQL can be used to expose data and functionality to other applications or services.
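
If it helps to make the checklist above concrete, here is a rough Python sketch that turns it into a lookup table and produces a per-cloud shortlist. The layer keys, the `"any"` label for tools not tied to one cloud, and the `shortlist` helper are all made up for illustration. The service names are just the ones from the list, and it is only a starting point for a conversation, not a recommendation.

```python
# Candidate services per layer, copied from the checklist above.
# Layer keys and the "any" (cloud-neutral) label are hypothetical names.
CLOUD_NEUTRAL = "any"
CLOUDS = {"aws", "azure", "gcp"}

CANDIDATES = {
    "storage": {
        "aws": ["Amazon S3"],
        "azure": ["Azure Blob Storage"],
        "gcp": ["Google Cloud Storage"],
    },
    "processing": {CLOUD_NEUTRAL: ["Apache Spark", "Apache Hadoop"]},
    "warehousing": {
        "aws": ["Amazon Redshift"],
        "gcp": ["Google BigQuery"],
        CLOUD_NEUTRAL: ["Snowflake"],
    },
    "integration": {"aws": ["AWS Glue"], CLOUD_NEUTRAL: ["Apache Kafka"]},
    "database": {
        "aws": ["Amazon RDS"],
        "azure": ["Azure SQL Database"],
        "gcp": ["Google Cloud SQL"],
    },
    "visualization": {CLOUD_NEUTRAL: ["Tableau", "Power BI", "Looker"]},
    "security": {
        "aws": ["AWS IAM"],
        "azure": ["Azure Active Directory"],
        "gcp": ["Google Cloud Identity and Access Management"],
    },
    "monitoring": {
        "aws": ["AWS CloudWatch"],
        "azure": ["Azure Monitor"],
        "gcp": ["Google Cloud Monitoring"],
    },
}


def shortlist(cloud: str) -> dict[str, list[str]]:
    """Per layer, list services native to `cloud`, then cloud-neutral ones."""
    if cloud not in CLOUDS:
        raise ValueError(f"unknown cloud: {cloud!r}")
    return {
        layer: by_cloud.get(cloud, []) + by_cloud.get(CLOUD_NEUTRAL, [])
        for layer, by_cloud in CANDIDATES.items()
    }


if __name__ == "__main__":
    for layer, options in shortlist("azure").items():
        print(f"{layer}: {', '.join(options)}")
```

A table like this mostly shows where the list names a native option for the client's cloud and where only a cloud-neutral tool (Snowflake, Kafka, Spark) is mentioned, which is usually where the follow-up questions about their needs should start.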