Blog Posts
-
Ever had a conversation with ChatGPT and wondered, “How does this thing know what to say?” You’re not alone. These AI systems seem almost magical; they can write poetry, debug code, explain quantum physics, and somehow know exactly what you…
-
Random Forest: The Swiss Army Knife of Machine Learning
So you’ve heard about Random Forest and you’re wondering what all the fuss is about? Well, buckle up because we’re about to dive into one of the most reliable and versatile algorithms in the machine learning toolbox. What’s This Random…
-
All About Tableau CRM
A quick explanation of Tableau CRM.
-
Multiple Regression with Scikit-learn: When One Variable Isn’t Enough
So you’ve mastered simple linear regression and you’re feeling pretty good about yourself. You can predict house prices based on square footage, estimate salaries from years of experience, and impress your friends at parties with your newfound ML skills. But…
-
Embedded Analytics with Tableau: Bring Insights Into Your App or Website
How to embed Tableau visualizations into webpages or apps.
-
Tableau Prep vs Alteryx: Which is Best?
So you’re drowning in messy data and need a lifeline? Welcome to the club! If you’re trying to decide between Tableau Prep and Alteryx for your data prep needs, you’ve come to the right place. Let’s break down these two…
-
Performance Optimization: Speeding Up Large Tableau Workbooks
Tips to optimize performance of Tableau workbooks.
-
Getting Started with TabPy: Bringing Python Magic to Your Tableau Dashboards
Start using Tableau’s free Python server.
-
Infer Schema when Importing to PostgreSQL or MySQL
Postgres and MySQL are great open-source databases in use by organizations of all sizes. Both of them are flexible and somewhat scalable. Both of them have nice GUI front-ends available to make them even easier to use. Sequel Pro…
-
Getting Logistic Regression Right with Scikit-Learn
So you want to do some logistic regression? Cool! It’s like linear regression’s slightly more complicated cousin who went to business school. Instead of predicting continuous values, logistic regression predicts probabilities and categories. Perfect for questions like “Will this email…
-
Data Warehousing with Hadoop
Almost from the moment Hadoop was first introduced, organizations have sought to replace their expensive data warehousing systems with it. Hadoop’s distributed nature and the fact that it uses commodity hardware make it cheap, massively scalable, and highly available. However,…
-
Using SQL Commands in Spark with SparkSQL
Spark has become a standard for performing analysis on huge amounts of data due to its distributed nature. SparkSQL evolved as a necessary component of Spark due to the need for working with structured data. There are many times when there…
-
Importing Word Docs into RapidMiner
On a project for a recent client I needed to apply some common Natural Language Processing (NLP) techniques to surveys they had gathered, but one of the requirements for the project was that the source document had to remain in…
-
Using Seahorse for Spark on a Cloudera HA Cluster
I’m loving Seahorse, a GUI frontend for Spark by deepsense.io. The interface is simple, elegant, and beautiful, and its drag-and-drop nature has the potential to significantly speed up development on a machine learning workflow. Thus far I haven’t run…
-
Installing MySQL from Scratch
You’ll probably see a lot of CSV files in the workplace, or generate them from the vast ocean of spreadsheets that are floating around the average office. But that won’t always be the case, and sometimes you’re going to need…