Big Data and Cloud Tips

Rant about Big Data, Cloud and related technologies.

Machine Learning

Lectures/Videos/MOOC

Machine Learning Summer School held at Purdue, 2011

Stanford course (CS229), taught by Professor Andrew Ng

ML by Andrew Ng

A real Caltech course, not a watered-down version

Statistical Learning from Stanford

ML by Hilary Mason (1, 2)

Books

Building Machine Learning Systems with Python

An Introduction to Statistical Learning with Applications in R (free)

Blogs (companies)

http://www.datarobot.com/blog/

http://blog.kaggle.com/

Blogs (individual)

http://shapeofdata.wordpress.com/

Data

UCI ML Repository

Datasets curated by Peter Skomoroch

Use Cases

Neural Network for Breast Cancer Data Built on Google App Engine

Interesting companies to follow

WIP