Abstract

We used network statistics of selected amenities to classify the income class and predict 2017 fiscal data of cities and municipalities in the Philippines. Edges between amenities were created based on a distance threshold of 500 meters, and network statistics were then computed from the resulting networks. For income classification, we used the k-NN, Decision Tree, Random Forest, Gradient Boosting Method (GBM), and Feed Forward Neural Network algorithms.
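As an illustration of the network construction described above, the following is a minimal sketch and not the authors' implementation. It assumes amenities are given as (latitude, longitude) pairs in degrees, that distance is great-circle (haversine) distance, and that two amenities are linked when they lie at most 500 meters apart. The function names `haversine_m` and `amenity_network_stats` and the sample coordinates are hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters (assumed constant)


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def amenity_network_stats(points, threshold_m=500.0):
    """Link every pair of amenities within threshold_m and summarize the network.

    Assumption: the threshold is inclusive (distance <= threshold_m).
    Returns the node count, edge count, and maximum degree.
    """
    degree = [0] * len(points)
    edges = 0
    # Brute-force pairwise comparison; fine for a sketch, quadratic in len(points).
    for i, j in combinations(range(len(points)), 2):
        if haversine_m(points[i], points[j]) <= threshold_m:
            edges += 1
            degree[i] += 1
            degree[j] += 1
    return {
        "nodes": len(points),
        "edges": edges,
        "max_degree": max(degree, default=0),
    }


if __name__ == "__main__":
    # Hypothetical amenity coordinates, for illustration only.
    sample = [
        (14.5995, 120.9842),
        (14.5995, 120.9870),
        (14.6035, 120.9842),
        (14.7000, 121.0000),
    ]
    print(amenity_network_stats(sample))
```

Statistics such as these, computed per city or municipality, would then serve as the input features for the classifiers and regressors. For large amenity sets, a spatial index would be a more practical choice than the pairwise loop used here.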

The highest classification accuracy was 34.15%, exceeding the random-chance threshold of 21.35% for 12 classes with imbalanced sample sizes. For regression of fiscal data, only the first four algorithms (k-NN, Decision Tree, Random Forest, and GBM) were employed. GBM yielded the highest r-squared values, 0.56 and 0.62 for income from internal and external sources, respectively. Although the number of amenities (i.e., nodes) dominated the top predictors, network statistics such as maximum degree and edge count also appeared among them in a limited number of cases.
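The abstract does not state how the random-chance threshold was computed. One common baseline for imbalanced classes is the proportional chance criterion (the sum of squared class proportions), sometimes multiplied by 1.25 as a stricter benchmark. The sketch below assumes that approach purely for illustration; the function name `proportional_chance` and the class counts are hypothetical and are not the study's data.

```python
def proportional_chance(class_counts, margin=1.0):
    """Proportional chance criterion: the sum of squared class proportions.

    margin is an optional multiplier (1.25 is a common convention); whether
    the study applied one is not stated, so it is an assumption here.
    """
    total = sum(class_counts)
    if total == 0:
        raise ValueError("class_counts must contain at least one sample")
    return margin * sum((n / total) ** 2 for n in class_counts)


if __name__ == "__main__":
    # Twelve perfectly balanced classes give a chance level of 1/12.
    print(proportional_chance([10] * 12))
    # Hypothetical imbalanced counts raise the chance level above 1/12.
    print(proportional_chance([40, 25, 15, 10, 5, 5]))
```

Under this criterion, any imbalance pushes the baseline above the 1/12 (about 8.33%) that 12 equally sized classes would give, which is consistent with the reported threshold being well above that value.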

Our work aims not only to further demonstrate the potential of using network statistics as predictive features, but also to identify which amenity network statistics are important for the wealth of cities and municipalities, at least in the Philippines.