What is a data lake in basic terms?


A storage location on cheap hardware that holds files in various formats and structures. You put schema-on-read software on top of it to scan and report on the data.
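To make "schema on read" concrete, here is a minimal sketch in Python. It assumes a local temp directory standing in for the lake, and the file names, the `user`/`amount` fields, and the helper functions are all made up for illustration. The point is only that files land as-is and the schema is applied by the reader, not at write time.

```python
import csv
import json
import tempfile
from pathlib import Path


def build_demo_lake():
    """Dump raw files of mixed formats into a directory (hypothetical data)."""
    lake = Path(tempfile.mkdtemp())
    (lake / "orders.csv").write_text("user,amount\nann,10.5\nbob,3\n")
    (lake / "orders.json").write_text('{"user": "ann", "amount": 4.5}\n')
    (lake / "logo.png").write_bytes(b"\x89PNG")  # stored, but no reader yet
    return lake


def read_events(lake_dir):
    """Scan the raw files and apply a schema only at read time."""
    for path in sorted(Path(lake_dir).iterdir()):
        if path.suffix == ".csv":
            with path.open(newline="") as f:
                for row in csv.DictReader(f):
                    yield {"user": row["user"], "amount": float(row["amount"])}
        elif path.suffix == ".json":
            for line in path.read_text().splitlines():
                rec = json.loads(line)
                yield {"user": rec["user"], "amount": float(rec["amount"])}
        # Any other format just sits in the lake until someone writes a reader.


def total_by_user(lake_dir):
    """A tiny 'report' over whatever the readers can interpret."""
    totals = {}
    for event in read_events(lake_dir):
        totals[event["user"]] = totals.get(event["user"], 0.0) + event["amount"]
    return totals
```

Calling `total_by_user(build_demo_lake())` sums the CSV and JSON records together and ignores the image, which is roughly the trade-off: writing is trivial, and the work of interpreting the data moves to whoever reads it.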


Data stored in formats like XML, JSON, CSV, image formats, and proprietary formats as well.
Any data you like: not only what you'd store in a DWH, but variably structured data (log files in CSV) as well.
Basically, if you can store the data in file storage in a file format, you can store it in a data lake.


Raw data, in whatever form, is your water. It flows into the data lake. The data lake, unfortunately, has a bunch of water from farms with pesticides and fertilizer, fish shit, random plants, and a bunch of other junk, which means you can't consume it yet. But it's a place the water can sit until it flows into your water treatment plants, and from there to places where it can actually be used, drunk, etc.


“If you think of a datamart as a store of bottled water – cleansed and packaged and structured for easy consumption – the data lake is a large body of water in a more natural state. The contents of the data lake stream in from a source to fill the lake, and various users of the lake can come to examine, dive in, or take samples.” That analogy helped me. Source - https://www.forbes.com/sites/bernardmarr/2018/08/27/what-is-a-data-lake-a-super-simple-explanation-for-anyone/


Data Lake = Object storage that stores structured and unstructured data in its raw state + Data catalog to index this data + (optionally) querying capabilities + Security and access control
One combination - Amazon S3 + AWS Glue + Amazon Athena
Another combination - HDFS + Spark
There are also managed data lake services such as AWS Lake Formation, and open-source storage layers such as Delta Lake that sit on top of the object storage.
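The "storage + catalog + query" breakdown above can be sketched in a few lines of Python. This is a toy model, not how S3, Glue, or Athena work internally: the dict standing in for object storage, the key layout `raw/<dataset>/<file>`, and the function names are all assumptions for illustration, and security/access control is left out entirely.

```python
import csv
import io
import json

# Hypothetical stand-in for object storage: key -> raw bytes.
object_store = {
    "raw/sales/2024-01.csv": b"region,revenue\neast,100\nwest,50\n",
    "raw/sales/2024-02.json": b'{"region": "east", "revenue": 25}\n',
    "raw/images/logo.png": b"\x89PNG",
}


def build_catalog(store):
    """Index objects by dataset (second path segment) and record their format."""
    catalog = {}
    for key in store:
        _, dataset, filename = key.split("/", 2)
        fmt = filename.rsplit(".", 1)[-1]
        catalog.setdefault(dataset, []).append({"key": key, "format": fmt})
    return catalog


def scan(store, catalog, dataset):
    """Query layer: use the catalog to find objects, then parse them on read."""
    for entry in catalog[dataset]:
        text = store[entry["key"]].decode("utf-8")
        if entry["format"] == "csv":
            yield from csv.DictReader(io.StringIO(text))
        elif entry["format"] == "json":
            for line in text.splitlines():
                yield json.loads(line)


def revenue_by_region(store, catalog):
    """Example query that spans files of different formats."""
    totals = {}
    for row in scan(store, catalog, "sales"):
        totals[row["region"]] = totals.get(row["region"], 0) + float(row["revenue"])
    return totals
```

The storage never changes shape here. The catalog is just metadata that tells the query layer where things are and how to read them, which is why the pieces can be swapped independently.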


A data lake is where you dump all your raw data, of all forms. A landing zone (LZ), if you will. A DB makes a poor LZ since it constrains you on schema, file types, and so on.


+1 to everything others have posted. Also, your compute is decoupled from storage, giving you more flexibility in choice of technology and spend. This is in contrast to traditional relational DBs where your compute and storage both depend on the host (server), unless you invest in a shared or replicated architecture, which is costly in itself.
