The challenge is to blend models from different analytics platforms (Python, R, and KNIME) into a single ensemble model. The data is the “airline data set” (http://stat-computing.org/dataexpo/2009/the-data.html), enriched with additional external data such as cities, daily weather (https://www.ncdc.noaa.gov/cdo-web/datasets/), US holidays, geo-coordinates, and airplane maintenance records. DepDelay is used as the target variable.
This workflow shows how the random forest nodes can be used for classification and regression tasks. It also shows how the out-of-bag data that each random forest learner produces can be used to estimate the accuracy of a random forest.
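The out-of-bag idea can be sketched outside KNIME in a few lines of plain Python. This is a minimal sketch, not the KNIME implementation: it swaps the random forest for bagged decision stumps to stay short, the data is synthetic, and the names `fit_stump`, `predict_stump`, and `bagging_with_oob` are hypothetical helpers. Each model is trained on a bootstrap sample, and every row is scored only by the models that did not see it during training.

```python
import random


def fit_stump(rows):
    """Return the one-feature threshold split with the fewest training errors."""
    best = None
    for f in range(len(rows[0][0])):
        for x, _ in rows:
            t = x[f]
            for left, right in ((0, 1), (1, 0)):
                errors = sum((left if xi[f] <= t else right) != yi for xi, yi in rows)
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    return best[1:]


def predict_stump(stump, x):
    f, t, left, right = stump
    return left if x[f] <= t else right


def bagging_with_oob(data, n_models=25, seed=0):
    """Train bagged stumps and estimate accuracy from out-of-bag votes."""
    rng = random.Random(seed)
    n = len(data)
    votes = [[0, 0] for _ in range(n)]  # per-row OOB votes for class 0 and class 1
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample (with replacement)
        in_bag = set(idx)
        stump = fit_stump([data[i] for i in idx])
        models.append(stump)
        for i in range(n):
            if i not in in_bag:  # only rows this model never saw may vote
                votes[i][predict_stump(stump, data[i][0])] += 1
    scored = [(v, data[i][1]) for i, v in enumerate(votes) if sum(v) > 0]
    correct = sum((1 if v[1] > v[0] else 0) == y for v, y in scored)
    return models, correct / len(scored)


# Synthetic two-class toy data (an assumption for illustration only).
rng = random.Random(42)
data = [([rng.gauss(0, 1), rng.gauss(0, 1)], 0) for _ in range(40)]
data += [([rng.gauss(4, 1), rng.gauss(0, 1)], 1) for _ in range(40)]

models, oob_accuracy = bagging_with_oob(data)
print(f"OOB accuracy estimate: {oob_accuracy:.3f}")
```

Because every row is left out of roughly a third of the bootstrap samples, the out-of-bag votes give an accuracy estimate without setting aside a separate test partition.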
This workflow shows how the Prediction Fusion node can be used to combine the predictions of a Naive Bayes classifier and an SVM classifier.
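Conceptually, fusing predictions means combining the class probabilities that each classifier assigns to the same row. The snippet below is a rough sketch in plain Python under stated assumptions: `fuse_predictions` is a hypothetical helper, the probabilities are made up for illustration, and the fusion rules shown (mean, median, max, min) are common choices that may not match the node's exact options.

```python
from statistics import mean, median

# Fusion rules applied per class across all classifiers (assumed set of rules).
COMBINERS = {"mean": mean, "median": median, "max": max, "min": min}


def fuse_predictions(distributions, method="mean"):
    """Combine per-classifier class probabilities into one prediction.

    distributions: list of dicts mapping class label -> probability,
    one dict per classifier, all with the same class labels.
    """
    combine = COMBINERS[method]
    classes = sorted(distributions[0])
    fused = {c: combine([d[c] for d in distributions]) for c in classes}
    total = sum(fused.values())
    if total > 0:  # renormalise so the fused scores sum to 1
        fused = {c: s / total for c, s in fused.items()}
    winner = max(fused, key=fused.get)
    return winner, fused


# Made-up outputs of two classifiers for a single row.
naive_bayes = {"delayed": 0.55, "on_time": 0.45}
svm = {"delayed": 0.20, "on_time": 0.80}

label, confidence = fuse_predictions([naive_bayes, svm], method="mean")
print(label, confidence)
```

Here the two classifiers disagree, and the mean rule lets the more confident one decide the fused label.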
This workflow shows how the tree ensemble nodes can be used for regression and classification tasks. Note: If you want to deploy a random forest, we recommend using the simpler random forest nodes.