Text Classification from Forum Posts

This workflow performs supervised topic classification of forum posts. The training set consists of the description files of the KNIME nodes. The topic classes are the nodes' top-level categories in the Node Repository (IO, Data Manipulation, etc.) from KNIME versions prior to 3.0. A Tree Ensemble model is trained on this set and then applied to the forum posts. For each post, the three topics with the highest predicted probability are kept as its topic classes.
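The top-three selection step can be illustrated outside KNIME. The following is a minimal Python sketch, not the workflow's actual implementation: it assumes the classifier has already produced one probability per topic class for a post, and the function name `top_k_topics` and the example probabilities are hypothetical.

```python
from typing import Dict, List, Tuple


def top_k_topics(class_probabilities: Dict[str, float], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k (class, probability) pairs with the highest probability."""
    # Sort classes by predicted probability, highest first.
    ranked = sorted(class_probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Hypothetical class probabilities for a single forum post (illustrative values only).
example = {
    "IO": 0.10,
    "Data Manipulation": 0.45,
    "Mining": 0.25,
    "Views": 0.15,
    "Flow Control": 0.05,
}

top3 = top_k_topics(example)
for topic, probability in top3:
    print(f"{topic}: {probability:.2f}")
```

Keeping three classes instead of one lets a post that touches several areas carry more than one topic label.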

Parsing the KNIME Forum

This workflow demonstrates how to parse the KNIME Forum. We work in several stages. First, we read the list of topic categories from the front page of the forum. Then we visit each category separately and search for all topics that are newer than 9 days; this limit mainly serves to speed up the workflow. If a next page is available, we take its topics into consideration as well. After parsing the thread list pages, we read all information from the individual topics.
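The paging and age-filter logic described above can be sketched in Python. This is an illustration under stated assumptions, not the workflow's node logic: the forum pages are replaced by an in-memory dictionary so the example runs without network access, the `Topic` and `Page` types and the page keys are hypothetical, and the age of a topic is assumed to be measured from a single timestamp per topic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Topic:
    title: str
    last_updated: datetime  # assumption: one timestamp decides the topic's age


@dataclass
class Page:
    topics: List[Topic]
    next_page: Optional[str] = None  # key of the following page, if any


def collect_recent_topics(pages: Dict[str, Page], first_page: str,
                          now: datetime, max_age_days: int = 9) -> List[Topic]:
    """Walk a category page by page and keep topics newer than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    recent: List[Topic] = []
    visited = set()  # guards against a page chain that loops back on itself
    page_id: Optional[str] = first_page
    while page_id is not None and page_id not in visited:
        visited.add(page_id)
        page = pages[page_id]
        recent.extend(t for t in page.topics if t.last_updated > cutoff)
        page_id = page.next_page  # follow the "next page" link when present
    return recent


# Hypothetical two-page category used only for demonstration.
now = datetime(2016, 1, 20)
pages = {
    "category/1": Page(
        topics=[Topic("A", datetime(2016, 1, 19)), Topic("B", datetime(2016, 1, 5))],
        next_page="category/2",
    ),
    "category/2": Page(
        topics=[Topic("C", datetime(2016, 1, 12)), Topic("D", datetime(2016, 1, 1))],
    ),
}

recent_titles = [t.title for t in collect_recent_topics(pages, "category/1", now)]
print(recent_titles)
```

In a real crawl, the dictionary lookup would be replaced by fetching and parsing the page, and the collected topics would then be opened one by one to read their full content.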
