Crashing the G-WAN web server, at the speed of light

Recently I signed a job, working as a humble software engineer in a cloud gaming company at their server team.

My employer, which is also a software engineer, mostly working on game engines, writing C++ and C#, holding some game demo I.P (hopefully) and claimed that he solved the problem of building a profitable 3D cloud gaming servers, which is the major technical problem that would impact their business model (again, hopefully, most of the cloud gaming providers got bankrupted or get low profit due to server and bandwidth expense). Briefly, he has no any actual experience on how to build a cloud server platforms. That's why I thought I might be able to help him building their products that is robust and scaleable.

Although I write Java and Clojure code. I don't detest C/C++ and any other programmings that would compile to machine code for performance. Actually, I love high performance, and I really expect to learn anything about them form the job.

They have a great vision, and guaranteed they would sell their company for 3 billion dollars and everyone in the company would got 5 million dollars.

Lookin good, "what should I do?" I asked. "We have found ultimate solutions", they smiled in mysterious. I was really intrigued to know what secret they found. "We are asking you to write our web server in C++ programming language". Sounds challenging, and I like challenges in my work. Later they presented me a web server that I have never heard: G-WAN.

They told me to do some research on this one (another software is ArangoDB, that was fine). Because they did't tell me what I am actually going to do in the project, I started from searching on the internet for introductions and documentations.

The first thing I investigated is their web site.

Snip20160607_1

The web page design looks decent. What they focus on the front page is like "It's the era of multicore computing". I surely know that, the words looks are just written for those who don't have experiences on server programming and project managers.

What's next? In the about page, I expected some more detail information about how this software works. Instead, I got this.

Snip20160607_3

I smell taste of mouldy medals. Then I tried to look into this website and trying to find more technical from this web site. In the FAQ page, I started to felt strange and disturbing.

Why? Because non of the cases in the FAQ use modern hardware and operating systems. Instead there are plenty of ancient and outdated configurations. Like more faster and scalable Ubuntu 10.04, their usually tested CPU is Intel Xeon W3680 @ 3.33GHz (6 Cores). I start to wondering am I just jumped through to the time that I have just enrolled my undergraduate school. The web site is more like a fraud to me. But I may be wrong, so I start to find how many servers are powered by G-WAN.

Shodan can do this job for me, and it did give me an interesting result. As a server have released for about 8 years ago, there are only approxmy 21 HTTP servers online and most of them are serving static contents or totally abandoned.

Snip20160607_4

 

I stopped to take deep look on this project because I know there must be some reason for this general purpose server not to be accepted and got so few users, because even the Clojure http server http-kit got at least 800 sites, it is young and not the most popular one in the small Clojure community.

I start to search the server name in Google. There is not much about it, but there are some threads from Reddit, Hacker News.

https://www.reddit.com/r/linux/comments/2mgo7o/wtf_is_gwan_web_server/

https://news.ycombinator.com/item?id=4109698

https://news.ycombinator.com/item?id=8130849

and some blogs

https://riccardo.forina.me/why-i-ll-never-trust-g-wan/

http://tomoconnor.eu/blogish/gwan-snakeoil-beware/

and Wikipedia debate

Most of them are really critical and I don't know how much they suffered to get angry like this. What is the worst thing can possible go wrong on a web server? Finally I found out the answer later by myself.

I reported my research to my boss and telling them I am not recommend this software in our projects. But it seems they don't agree with that and tells me to do the test and make my decision.

Unfortunately, I got really bad stomachache that day an have to stay at home and waiting for my No.2 to come out at any time (sorry for the description). And I also realised that there is no qualified server grade equipments in the office and the only way to perform a comprehensive test is to use my own cluster that I was only use do my own develop and run my own project. Because I really want to know how fast G-WAN can be, I suggested to stay at home test G-WAN and other candidates on my equipments.

Then I totally destroyed any my last expection on the software. For the record, I posted the results from Apache Bench and Weighttp to the GitHub Repo. I have to say, it is not  a complete test, but I am pretty sure G-WAN is totally and utterly a bad choice to any projects that are not toys.

Because it crashed under 1k concurrency test, in a second.

Why it crashed? I am wondering. I started to look for any clues, but nothing left in the log, except this:

Could not attach to process. If your uid matches the uid of the target
process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf
ptrace: Operation not permitted.
No frame selected.
Signal : 11:Address not mapped to object
Signal src : 1:.
errno : 0
Thread : 10
Code Pointer: 000000404f4c module:gwan function:?? line:0
Access Address: 000000000018
Registers : EAX=ffffffffffffffff CS=00000033 EIP=000000404f4c EFLGS=000000010246
 EBX=7fa0b19ff000 SS=00000000 ESP=7fa0d8ffa170 EBP=7fa0d8ffa1c0
 ECX=000000000008 DS=00000000 ESI=0000ffffffff FS=00000033
 EDX=000000000008 ES=00000000 EDI=7fa0b1a009b0 CS=00000033
Module :Function :Line # PgrmCntr(EIP) RetAddress FramePtr(EBP)
No available stack frames
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[1] 31224 segmentation fault (core dumped) sudo ./gwan

Still does not make any sense. What I can see from this "Segment Fault" is the developer got backfired from using dark magic reading deallocated memory address (I got almost the same problem on my graph database project when operating unsafe memory, but fixed).

I reported this exception to my boss. He said "That's must be your fault. You must use it wrong".

I panicked.  I can foreseen my miserably life, never have a chance to go home at the time I should be, playing with my cat and my PlayStation.

He may discovered and my co-worker also have no faith on developing a regular web server based on such an unstable foundation. My boss agreed for us to use golang instead.

"What a nice day" I think. "But I will use it on my own", he said. I smiled at him and did't say anything else.

So, why they are so crazy about this. It seems most of the people are attracted by it's amazing compare chart.

local_nginx_lighty_gwan_100k

 

G-WAN is above 2x faster than any other web servers, including Nginx, the one that considered as the most fast and widely-used web server in the world, looks like crap in the chart. If that was true then those open source authors are either dumb-ass or G-WAN guys are really genius. But truly, they are not.

They are even not close to their claim. According to my test results, G-WAN is only at most 30% faster than Jetty. Compare to golang, G-WAN is 100k qps against 70k qps goroutine without any frameworks. But when you consider start to build something upon G-WAN, it is going to a nightmare.

I am not trying to persuade my employer to give up hope on it, because he paid for the support subscription. Looks like he trusts sales men more that his own team. Nice job on helping those rich Swiss people. He will not understand what I have been suffered until he did the same evaluation like I did.

One week after I submitted my report, those "TrustLeap" guys gives me their comment on my test and totally ignored my questions about how the crash happend. They criticize me not using a really fast operating system (which the newest one is not the fastest), my kernel version is too high, I tested their web server with X window started. But they just didn't explain anything about WHY THEIR SERVER CRASHED. They implied that the only way to run their server right is to use old OS like Ubuntu Server 10.04, which have already been stopped supported from Canonical. The is how those guys treat their customers.

I was so furious about wasting my time on such program and this problematic stuff not getting recognized by my employer. In another perspective, it is a successful project. It did attracted some people like my employer to pay for it even he did not know anything about distributed systems, scaleability and software architecture. It is also not a successful project because their trick is so obvious to professionals that can only fool newbies. I am going to end this unpleasant article with quote by voretaq from ServerFault.

I thought the IT world learned its lesson about sacrificing all in the name of performance with qmail and the rest of the djb family of things...

Things happened in recent 1 month

At last, I did quit my job.

The company I used to work for did make a lot of interests by selling mobile phone game virtual currency (About 2,600,000 USD cash flow per month), as we called gems. But they still have not figure out how to divide the incomes with their publisher. So I still have no profit sharing.

As they told me the amount of profits I will get can only cover my tuition fee for one year, that does not include the cost of food, lodging and equipments, I have low passion on dealing with my works. Instead, I start to thinking about make some design that noteworthy in my resume.

Reviewed the source code of old WebFusion project that was written in Java. I found that I can make a better architecture by using the knowledge I have learned from my employed works. Improving the performance and robustness of the whole system. So, I announced the cancellation of the old project and start new one from scratch codenamed odin. The new design will be revealed in other article in this blog.

I have meet my study buddy and best friend, they informed me that I may off schedule on the plan to prepare for language tests. That is not a good news to me but I have confidence to catch up the schedule. I know my weaknesses, what I need to do is keep practicing.

 

Hello world!

I created this blog, to take notes from my software development, daily studying and language practice. Considering the dishonoured network condition in my country, which is so-called the only communism region in this world, I setup this site in the US, avoiding unpleasants paperwork.

As you can see I am not a native English speaker. But I intend to improve my skills for better understanding of the material I read and represent my thought. You can consider it as the struggle from a young and angry man which have early midlife crisis.

A little more about myself. I am a software engineer now working for a game company which made the potential most popular League of Legends theme mobile game in China. Responsible for architecture of server clusters. I built a middleware that make the cluster scalable to bear the increasing players only with deploying a single machine. This design helped my company reduced the costs on hardware and maintenance manpower. I also built the backstage tools for operation team alone. Those works consists of business intelligence,logging system and player/content management systems. Operation team use it as reliable sources for analyzing data and decision making. When our chief server programmer was too busy, I helped him to finish some work like VIP system (which is the system for stimulating players to pay more in our game by providing various discount) and messaging system. I also skilled in iOS programming, when the client team have some trouble with native layer, I also provide supports like In App Purchase.

Before employed, I was busy on a project name WebFusion, integrating information from various web services. Providing instant messaging, micro-blogging, e-mail and content recommending systems. If you are interested, take a look at https://www.shisoft.net/. This project is paused due to lacking of time to maintain when employed.

I am not pleased with my job right now. It was interesting in the beginning because I learned new stuff and did improved my skills. But that feeling faded out when I found most of my work was something like CRUD, I found it boring. I really want to continue working on WebFusion project in my spare time but when employed, the job squeezed out my time. Building such complex system requires knowledge that I need a lot of time to learn. Then I realized that I  will never have a chance to create things I really care about. So I decide to quit my job and get a chance for overseas studying. My boss said I am crazy, but really, I DON'T CARE. I had enough with those incessant unpaid overtime works and there is no anything like profit share contract for me, so beat it!