

This article/post is from a third-party website. The views expressed are those of the author. We at Capacity Building & Development may not necessarily subscribe to it completely. The relevance & applicability of the content is limited to certain geographic zones. It is not universal.


Sunday, April 19, 2015

Data science demands elastic infrastructure 04-19

Those companies that try to run big data projects in data centers may be setting themselves up for failure. Matt Asay explains. 

As companies struggle to make sense of their increasingly big data, they're laboring to figure out the morass of technologies necessary to become successful. However, many will remain stymied, because they keep trying to fit a necessarily fluid process of asking questions of their data into outmoded, rigid data infrastructure.
Or as Amazon Web Services (AWS) data science chief Matt Wood tells it, they need the cloud.
While the cloud isn't a panacea, its elasticity may well prove to be the essential ingredient to big data success.

How much cloud do I need?

The problem with trying to run big data projects within a data center revolves around rigidity. As Matt Wood told me in a recent interview, this problem "is not so much about absolute scale of data but rather relative scale of data."
In other words, as a company's data volume takes a step function up or down, enterprise infrastructure can't keep up. In his words, "Customers will tool for the scale they're currently experiencing," which is great... until it's not.
In a separate conversation, he elaborates:
"Those that go out and buy expensive infrastructure find that the problem scope and domain shift really quickly. By the time they get around to answering the original question, the business has moved on. You need an environment that is flexible and allows you to quickly respond to changing big data requirements. Your resource mix is continually evolving--if you buy infrastructure, it's almost immediately irrelevant to your business because it's frozen in time. It's solving a problem you may not have or care about any more."
Success in big data depends upon iteration, upon experimentation as you try to figure out the right questions to ask and the best way to answer them. This is hard when dealing with a calcified infrastructure.

A eulogy for the data center?

Of course, it's not quite so simple as "all cloud, all the time."
Data, it would seem, has to obey fundamental laws of gravity, as Basho CTO Dave McCrory told TechRepublic in an interview:
"Big data workloads will live in large data centers where they are most advantaged. Why will they live in specific places? Because data attracts data.
"If I already have a large quantity of data in a specific cloud, I'm going to be inclined to store additional quantities of large data in the same place. As I do this and add workloads that interact with this data, more data will be created."
Over time, enterprises will look to the public cloud for all the reasons Wood describes, but legacy data is unlikely to make the migration. There's simply no reason to try to house old data in new infrastructure, at least not most of the time.
But some companies will find that they're more comfortable with existing data centers and will eschew the cloud. I'm not talking about hidebound enterprise curmudgeons that shout "Phooey!" every time AWS is mentioned, either. No, sometimes the most data center-centric of companies will be innovators like Etsy.
As Etsy CTO Kellan Elliott-McCrea informed TechRepublic, once Etsy had "gained confidence" in its ability to manage its Hadoop clusters (and other technology), it brought them in-house, netting a 10X increase in utilization and "very real cost savings."
Nor is Etsy alone. Other new-school web companies like Twitter have opted to run their own data centers, finding that this gives them greater control over their data.

You're no Twitter

As highly as you may estimate your abilities, the reality is that you're probably not an Etsy, Twitter, or Google. As painful as it is to say it, most of us are average. By definition.
This was Microsoft's great genius: rather than cater to the Übermensch of IT, Microsoft lowered the bar to becoming productive as a system administrator, developer, etc. In the process, Microsoft banked billions in profits by helping make a good sysadmin better or a decent developer good.
Regardless, all enterprises need to establish infrastructure that helps them to iterate. Some, like Etsy, may have figured out how to do this in their data centers--but for most of us, most of the time, Wood's advice rings true: "You need an environment that is flexible and allows you to quickly respond to changing big data requirements."
In other words, odds are that you're going to need the cloud.
