Random Forests, a type of classification method, are widely adopted in Data Science for good reason. Random Forests in Python are particularly easy … [Read more...] about Random Forests in Python
Blog
Linear Regression in Python
Linear Regression in Python is easy to implement and is useful for predicting continuous numbers (such as forecasting revenues or predicting a house … [Read more...] about Linear Regression in Python
Cross-Validation in Scikit-learn
Cross-validation in Scikit-learn is important because it gives us not only the means to train our model, but also to score its effectiveness. Without … [Read more...] about Cross-Validation in Scikit-learn
Data Warehousing with Hadoop
Almost from the moment Hadoop was first introduced, organizations have sought to replace their expensive data warehousing systems with it. … [Read more...] about Data Warehousing with Hadoop