I am thrilled to announce that the first tech demo of the new WebFusion (codenamed odin) is complete.
The address of the demo is: http://www.shisoft.net/
This demo shows feed recommendation by user-selected categories, drawing on feeds from about 1,000 Twitter accounts that I followed for this demo. The feature imitates the mobile app Zite, which has been acquired by Flipboard. Simply type what you have in mind into the text box, and the system will suggest categories automatically.
You can then use your mouse or keyboard to select an item. If you would like the system to recommend feeds for a combination of categories, feel free to select more than one. Then click the search button to submit your query.
Your recommended feeds should appear almost immediately. In the image above, you can see four feeds from Twitter. You can scroll down the page to load more until the database has no more feeds related to your selected categories. The results should vary between attempts because the system receives and processes new feeds in real time as other people publish them, so you always see the newest feeds once they arrive in the system.
The system also supports recommending feeds for user input that does not match any category in my database (updated on 8 January 2016). Every internal category is prefixed with the symbol "♦", which indicates that the category has comprehensive features in my database and that the system can provide accurate recommendations for it. A category without the symbol means the system will build features from the words the user typed, so its recommendations may be less accurate than those for internal categories.
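The split between internal and free-text categories can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the demo's actual code: the function name `resolve_category`, the `internal_features` mapping, and the idea of using plain lowercase words as fallback features are all hypothetical.

```python
INTERNAL_MARK = "\u2666"  # the "♦" prefix shown in front of internal categories


def resolve_category(label, internal_features):
    """Return (name, features, is_internal) for a label picked in the text box.

    internal_features is a hypothetical mapping from category name to a set
    of keyword features prepared in advance.
    """
    name = label.lstrip(INTERNAL_MARK).strip()
    if label.startswith(INTERNAL_MARK) and name in internal_features:
        # Internal category: reuse the features already stored for it.
        return name, internal_features[name], True
    # Free-text category: fall back to the words the user typed.
    words = {w.lower() for w in name.split()}
    return name, words, False
```

With this shape, a free-text label only ever yields the handful of words it contains, which is one way to see why such recommendations would be less accurate than those backed by stored features.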
You can switch categories by typing new names in the text box, or simply click the links below each feed labeled "Related Categories".
This is NOT a search engine. To match content to categories, the system considers both the keywords and the meaning of the content, including its probability distribution over topics.
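To make the difference from keyword search concrete, here is a minimal Python sketch of one way to blend a keyword signal with a topic-distribution signal. It is not the formula the demo uses: the Jaccard overlap, the Hellinger-based similarity, the `keyword_weight` parameter, and the dictionary layout with `words` and `topics` keys are all assumptions chosen for illustration.

```python
import math


def keyword_overlap(feed_words, category_words):
    """Jaccard overlap between two keyword collections, from 0.0 to 1.0."""
    feed, cat = set(feed_words), set(category_words)
    if not feed or not cat:
        return 0.0
    return len(feed & cat) / len(feed | cat)


def topic_similarity(p, q):
    """1 minus the Hellinger distance between two topic distributions.

    p and q are equal-length lists of probabilities, each summing to 1.
    """
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))  # Bhattacharyya coefficient
    return 1.0 - math.sqrt(max(0.0, 1.0 - bc))


def score(feed, category, keyword_weight=0.5):
    """Blend both signals; keyword_weight is a hypothetical tuning knob."""
    kw = keyword_overlap(feed["words"], category["words"])
    tp = topic_similarity(feed["topics"], category["topics"])
    return keyword_weight * kw + (1.0 - keyword_weight) * tp
```

A feed that shares no keyword with a category can still score well here if its topic distribution is close to the category's, which is the behavior a pure keyword search would not give.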
This function is still experimental. After filtering, there are over 100,000 categories that the user can select for recommendations. The training process is entirely unsupervised, so it is impossible for me to check the accuracy of all of them. I know of several flaws at present. Many overlapping categories have almost the same features, so their recommendations are identical. The system can distinguish Apple the computer company from apples on trees, but it cannot tell "Google" from "Microsoft", because their topics are the same and their keywords are almost identical. Because all of the categories and their data come from Wikipedia dumps, the quality cannot be controlled, and some categories are overfitted to their more popular subcategories.
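The overlap problem can be illustrated with a short Python sketch that flags category pairs whose feature vectors nearly coincide. This is a hypothetical diagnostic, not something the demo is known to run: the cosine measure, the 0.95 threshold, and the toy weights below are invented for the example.

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def near_duplicates(categories, threshold=0.95):
    """Return name pairs whose feature vectors are at least threshold similar."""
    names = sorted(categories)
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if cosine(categories[a], categories[b]) >= threshold
    ]


# Toy keyword weights, made up purely for illustration.
toy = {
    "Google": {"search": 0.6, "software": 0.7, "cloud": 0.4},
    "Microsoft": {"search": 0.5, "software": 0.7, "cloud": 0.4},
    "Apple (fruit)": {"tree": 0.8, "fruit": 0.6},
}
```

Calling `near_duplicates(toy)` pairs up the two company entries and leaves the fruit entry alone, mirroring the case where keywords and topics are too similar to separate.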
This function is part of the new WebFusion. In the demo version, the "Share" and "Reply" buttons are not functional yet; they only serve as placeholders.
The dedicated server for this demo is currently in Shanghai, because the system is huge and I cannot afford virtual servers from a cloud provider like DigitalOcean. Because of the national Great Firewall, users in the U.S. may experience timeouts or find the service temporarily unavailable. I am trying to overcome these problems. If you have any difficulty viewing this demo, please don't hesitate to contact me.
* For now I have done my best to improve the experience of visiting this demo by redirecting traffic through 3 more servers. The outbound server is in San Francisco, alongside this blog site; the origin server is still in Shanghai.
* I have already replaced ssh with shadowsocks for the internal proxy service to Twitter and other feed sources that are banned in China. The route to my shadowsocks server is not stable either, so I used HAProxy to load-balance and fail over across 9 different routes to the shadowsocks server in the U.S., and it works.
* It has been a long time since I first started collecting Twitter feeds and computing their features. I have made the parameters slightly stricter, so the recommendations for categories should be more accurate.
* You can get more technical details from here