Executive Summary

The Senate of the Philippines maintains a website that houses around 15,000 Senate bills from the 13th through the 18th Congress of the Philippines. The only way to navigate and search these bills is by keyword or by bill number. This makes the site difficult for researchers and citizens to use unless they already know the bills well and remember their keywords or bill numbers. In this paper, we therefore propose two methods to support searching and exploring Senate bills. Both methods use data from all Senate bills available on the Senate website, with features such as the title.

The first method is an Information Retrieval (IR) system that retrieves the bills closest to a given search term. It applies TF-IDF to the bill titles and uses Euclidean distance to return the top K closest bills. The IR system returns relatively relevant results: of the 5 search terms we tested, 4 had 80% precision, based on manually checking whether each returned bill was relevant.
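The retrieval step described above can be sketched as follows. This is a minimal, standard-library-only illustration, not the implementation used in the paper: the whitespace tokenizer, the smoothed IDF formula, the L2 normalization, the function names, and the sample titles are all assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the paper does not specify one.
    return text.lower().split()

def fit_idf(titles):
    # Document frequency: number of titles that contain each term.
    n = len(titles)
    df = Counter()
    for title in titles:
        df.update(set(tokenize(title)))
    # Assumption: smoothed IDF, log((1 + n) / (1 + df)) + 1.
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}

def tfidf_vector(text, idf):
    # Sparse TF-IDF vector (dict), L2-normalized; unknown terms are ignored.
    counts = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else vec

def euclidean(a, b):
    # Euclidean distance between two sparse vectors.
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def search(query, titles, k=5):
    # Return the k titles whose TF-IDF vectors are closest to the query vector.
    idf = fit_idf(titles)
    doc_vecs = [tfidf_vector(t, idf) for t in titles]
    q = tfidf_vector(query, idf)
    ranked = sorted(range(len(titles)), key=lambda i: euclidean(q, doc_vecs[i]))
    return [titles[i] for i in ranked[:k]]

def precision_at_k(results, relevant):
    # Fraction of returned results that were manually judged relevant.
    return sum(1 for r in results if r in relevant) / len(results)

# Hypothetical placeholder titles, not taken from the Senate website.
TITLES = [
    "An act providing free tuition in state universities",
    "An act increasing the salary of public school teachers",
    "An act establishing a national cancer control program",
    "An act regulating the use of plastic bags",
]

top = search("teachers salary", TITLES, k=1)
```

With these placeholder titles, the query "teachers salary" ranks the teacher salary bill first, because it is the only title that shares terms with the query. A result list in which 4 of 5 bills are judged relevant gives a precision of 0.8, which is how the 80% figure above is computed.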

The second method is clustering. We applied hierarchical clustering with Ward’s method to the bill titles. We tried a two-level scheme (general and more specific clusters) intended to serve as a navigation system. Interestingly, the clusters do not appear to form levels; they can be interpreted using only the more specific clustering, obtained with a lower Δ (delta). We found 17 meaningful groups of bills that can be used to navigate the large number of bills on the Senate website.
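To illustrate how a threshold such as Δ controls the granularity of Ward clustering, here is a minimal standard-library sketch. It is not the paper's code. It assumes that Δ acts as a cutoff on the Ward merge cost, defined here as the increase in within-cluster sum of squares; libraries such as SciPy may report merge heights on a different scale. The function names and the toy 2-D points, which stand in for TF-IDF title vectors, are hypothetical.

```python
def merge_cost(cluster_a, cluster_b):
    # Ward's criterion: increase in within-cluster sum of squares if a and b merge.
    members_a, centroid_a = cluster_a
    members_b, centroid_b = cluster_b
    na, nb = len(members_a), len(members_b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
    return na * nb / (na + nb) * sq_dist

def ward_clusters(points, delta):
    # Each cluster is (member indices, centroid); start with one cluster per point.
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > 1:
        # Find the pair of clusters with the cheapest merge.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                cost = merge_cost(clusters[a], clusters[b])
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        cost, a, b = best
        # Assumption: stop merging once the cheapest merge exceeds delta.
        if cost > delta:
            break
        members_a, centroid_a = clusters[a]
        members_b, centroid_b = clusters[b]
        na, nb = len(members_a), len(members_b)
        # The merged centroid is the size-weighted mean of the two centroids.
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(centroid_a, centroid_b)]
        merged = (members_a + members_b, centroid)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return [sorted(members) for members, _ in clusters]

# Toy points standing in for title vectors (hypothetical data).
POINTS = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]

specific = ward_clusters(POINTS, delta=0.1)   # low threshold: many small clusters
general = ward_clusters(POINTS, delta=2.0)    # higher threshold: fewer, broader clusters
```

On these toy points, the lower threshold leaves every point in its own cluster, while the higher one merges the two nearby pairs into two groups. This mirrors the general versus more specific levels described above, with a lower Δ giving the more specific grouping.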