August 7, 2012
"You’re welcome to love Android and hate Apple. Just don’t be fooled into thinking Samsung are the good guys."

(Source: kernelmag.com)

August 7, 2012

Let’s see if I post more with Tumblr. I sure hope I do.

August 15, 2009
nginx+drupal revisited

nginx

Recent nginx updates support try_files and internal location directives. These features make nginx more flexible as a web server for Drupal.

  • try_files checks for the existence of files in order and serves the first one that is found. For Drupal, this lets the server check the Boost-generated cache, imagecache images, and the Drupal installation, in that order.
  • @location syntax declares a named (internal) location. Named locations cannot be requested directly by clients. They are reached through try_files and error_page handlers (for example, customized 40x messages).

drupal

Using try_files and @location syntax together provides an easier way to run Drupal.

location / {
  try_files $uri $uri/ @drupal;
}
location @drupal {
  rewrite ^/(.*)$ /index.php?q=$1 last;
}
location ~ \.php$ {
  fastcgi_pass    127.0.0.1:3456;
  fastcgi_index   index.php;
  fastcgi_param   SCRIPT_FILENAME $document_root$fastcgi_script_name;
  include         fastcgi_params;
}

Most FastCGI parameters are set in fastcgi_params, a file that ships with the default nginx installation.
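The ordered lookup described earlier (Boost cache first, then a real file, then Drupal) could be written as a variant of the location / block. This is only a sketch: the cache/normal/<host>/ directory and the _<query>.html file naming are assumptions about how Boost is set up, so check them against your own Boost settings. It also ignores logged-in users and non-GET requests, which should bypass the static cache.

```nginx
location / {
  # Assumed Boost layout: cache/normal/<host>/<path>_<query>.html
  # Order: 1. Boost's static copy, 2. a real file, 3. a directory, 4. Drupal
  try_files /cache/normal/$host${uri}_${args}.html $uri $uri/ @drupal;
}
location @drupal {
  # Named location: only reachable internally, never requested by a client
  rewrite ^/(.*)$ /index.php?q=$1 last;
}
```

The braces in ${uri} and ${args} matter: without them nginx would read $uri_ as a variable named uri_.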

security and performance

# protection for sensitive info
location ~ (/\..*|settings\.php$|\.(htaccess|engine|inc|info|install|module|profile|pl|po|sh|.*sql|theme|tpl(\.php)?|xtmpl)$|^(Entries.*|Repository|Root|Tag|Template))$ {
  deny all;
}
# turn off access logs for stylesheets and scripts
location ~ \.(css|js)$ {
  access_log off;
}
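# fix imagecache: let Drupal generate derivatives that do not exist yet.
# nginx uses the first regex location that matches, so this block must
# come before the generic image block below.
location ~ /imagecache/ {
  try_files $uri @drupal;
  expires 45d;
}
# performance for images
location ~* \.(jpg|jpeg|png|gif|ico)$ {
  expires 45d;
  access_log off;
}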

July 12, 2009
Apache proxy, cache, and web service optimization

At UPEI our web pages are powered by Drupal, an open-source web platform, but served as static pages that are mirrored (in our terminology, scraped) by httrack to a front-end server. Most components of the web pages are static, except emergency messages, contact forms, and some media files. All external access goes to the front-end server, and only a few requests reach our back-end server through the university firewall.

INFRASTRUCTURE

Our system consists of five pieces: a front-end web server (which doubles as a reverse proxy), a back-end web service and HTTP media server, a back-end production server, a development server, and a database server. The front-end web server, the back-end production server, and the development server all run Debian Linux and an old but very stable Apache 1.3. The web service and media server runs Nginx, a very fast and reliable HTTP server. Our database server is MySQL 5.1.

CHALLENGES

The original infrastructure had only the front-end static HTTP server and the back-end HTTP Drupal server. While most content on our website is static, we still need some dynamic content for feeds, emergency messages, and forms. The back-end HTTP Drupal server was handling too many PHP requests and was dying.

The major issues I am concerned about:

Performance. Our infrastructure must handle all hits for emergency situations. In other words, external access to Drupal must not rely on Apache.

Security. All external inputs must be filtered, monitored, and isolated from the production server.

Reliability. Production server down time must not affect public access.

Scalability. The infrastructure must be open to future expansion.

The bottleneck of our system was in the dynamic part.

HTTP SERVERS

The front-end server is a stable Debian Linux installation that serves all static pages and acts as a reverse proxy to web services and legacy systems. Since our page views are well under 1 million per day, the server runs happily with Apache 1.3 as a static server. Requests for small media files are reverse-proxied to the back-end media servers, and the responses are kept in Apache's cache.

The back-end production server provides Drupal access to all content managers in the university. The development server is a sandbox for theme and module development. Both servers run Debian Linux and Apache 1.3 and connect to separate database servers.

The media and forms server runs on Nginx to provide media file downloading/streaming and non-cacheable AJAX responses. It has restricted access to the production database server, and most POST requests are filtered and monitored. Nginx is well known for its performance and scalability; WordPress.com uses Nginx as a load balancer.

OPTIMIZATION

Compression. All text, including HTML files, JavaScript, and CSS stylesheets, is compressed with mod_gzip on the front-end server.

Cache on the client side. All images and fonts are cached in the browser through the Expires and Cache-Control headers for at least 45 days. ETag is disabled for binary content. This makes a significant difference on the second visit. Our home page is large (very graphics-oriented for marketing purposes), so the first visit may be slow (2.58MB in size). With client-side caching, however, the second visit transfers only about 30KB to 50KB. Large images are also loaded in the background.

Cache on the server side. Small media files are cached on the front-end server to avoid proxy requests to the back end.

Home page CSS refresh issue. HTTP Cache-Control and Expires headers are used on the front-end server to make client browsers reload the home page on every visit.
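Our front end is Apache 1.3, so the following is not the production configuration. Purely as an illustration in Nginx terms, the client-side caching policy above might be sketched like this. The font extensions and the assumption that the mirrored home page is served as /index.html are mine, not part of the real setup.

```nginx
# Long-lived browser cache for images and fonts (45 days, as described above).
# "expires" emits both the Expires and the Cache-Control: max-age headers.
location ~* \.(jpg|jpeg|png|gif|ico|ttf|otf|woff)$ {
  expires 45d;
  etag off;      # the etag directive requires nginx 1.3.3 or later
}
# Home page (assumed to be /index.html): make browsers revalidate on
# every visit so that CSS changes show up immediately.
location = /index.html {
  expires -1;    # a negative value sends Cache-Control: no-cache
}
```

The exact-match location targets /index.html rather than / because a request for / is internally redirected to the index file, and headers are added by the location that serves the final response.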

July 8, 2009
PHP 5.3 and MySQL 5.1

Running these brand new distributions.

PHP 5.3.0-0.dotdeb.6 (cgi-fcgi) (built: Jul  2 2009 06:14:07)
Copyright (c) 1997-2009 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2009 Zend Technologies
    with Xdebug v2.0.4, Copyright (c) 2002-2008, by Derick Rethans
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 189435
Server version: 5.1.34-0.dotdeb.0-log (Debian)

February 15, 2009
Use Nginx to run your Drupal site

Type: Tutorial
Difficulty: Intermediate

I have a fresh website based on Apache+PHP5 that I want to convert to Nginx and PHP5-FastCGI. What can I do?

Stage 1 CGI version of PHP5

Nginx cannot load PHP5 as a module the way Apache does; it talks to the CGI build of PHP5 over FastCGI. In FastCGI mode, PHP5 runs as a server that forks a number of children to handle incoming requests. That number is set in the start-up script and can be whatever the load requires. Of course, we do not want to exhaust the server's memory, so keep memory_limit * number of PHP children < available memory. For example, with memory_limit at 64 MB and 10 children, budget for up to 640 MB. In Debian/Ubuntu systems, we can install php5-cgi in one line:
root@domU:~# apt-get install php5-cgi
This installs the CGI version of PHP5, which includes FastCGI support. Most modern Linux distributions come with a similar package management system. After installation, run the following command to confirm that PHP has FastCGI enabled (look for cgi-fcgi in the output).
root@domU:~# php5-cgi -v
PHP 5.2.4-2ubuntu5.5 with Suhosin-Patch 0.9.6.2 (cgi-fcgi) (built: Feb 11 2009 20:01:54)
Copyright (c) 1997-2007 The PHP Group
Zend Engine v2.2.0, Copyright (c) 1998-2007 Zend Technologies

Stage 2 Spawn the FastCGI server

The PHP5-CGI binary can act as a FastCGI server on its own. However, setting up its environment by hand is complicated. Instead, we can use spawn-fcgi, the general FastCGI spawner that ships with Lighttpd, to create the service. Download the latest version of Lighttpd, extract the package, run the configure script, run make, and copy the spawn-fcgi binary to /usr/bin.
root@domU:~/lighttpd-1.4.20# ./configure
root@domU:~/lighttpd-1.4.20# make
root@domU:~/lighttpd-1.4.20# cp src/spawn-fcgi /usr/bin
Then we can spawn the PHP5-FastCGI like this:
root@domU:~# /usr/bin/spawn-fcgi -f /usr/bin/php5-cgi\
   -a 127.0.0.1 -p 16000 -C 5 -F 2\
   -P /var/run/fastcgi-php.pid\
   -u www-data -g www-data
This command starts two PHP5 FastCGI parent processes (-F 2), each with 5 children (-C 5), and binds them to 127.0.0.1 (localhost) on port 16000. So we have ten child processes handling PHP requests. The PHP processes run as the www-data user and group.

Stage 3 Build Nginx

Imagine how one man can beat the world. Nginx (Engine X) is a blazingly fast HTTP server written by Igor Sysoev. According to Netcraft in December 2008, Nginx served or proxied 3.5 million virtual hosts, putting it in third place in the market. Two of the Alexa Top-100 sites use Nginx. Download Nginx from its official site, extract the tarball, then run:
root@domU:~/nginx-0.7.34# ./configure --with-http_ssl_module\
  --with-http_realip_module --with-http_addition_module\
  --with-http_sub_module --with-http_dav_module\
  --with-http_flv_module --with-http_stub_status_module\
  --with-mail --with-mail_ssl_module\
  --http-log-path=/var/log/nginx/access.log\
  --http-client-body-temp-path=/mnt/nginx/client\
  --http-proxy-temp-path=/mnt/nginx/proxy\
  --http-fastcgi-temp-path=/mnt/nginx/fastcgi\
  --pid-path=/var/run/nginx.pid\
  --lock-path=/var/lock/nginx.lock\
  --sbin-path=/usr/sbin\
  --error-log-path=/var/log/nginx/error.log\
  --conf-path=/database/configuration/nginx/nginx.conf\
  --user=www-data --group=www-data --with-sha1=/usr/lib
root@domU:~/nginx-0.7.34# make && make install
Nginx is now configured with most of the useful modules. Note that --http-client-body-temp-path, --http-proxy-temp-path, and --http-fastcgi-temp-path set the temporary directories used by Nginx. The default user and group can be set to the system’s default user for the HTTP service instead of nobody, although they can also be configured at runtime.

Stage 4 Run Nginx

Starting Nginx is simple and straightforward. After properly configuring your Nginx settings, just type nginx and hit return, and it will start. I also provide a set of Nginx configuration files here to simplify the process. The configuration contains several important pieces that make Drupal work under Nginx.
# rewrite rules
if (!-e $request_filename) {
  rewrite ^/(.*)$ /index.php?q=$1 break;
}

include supercache;

# serve php files
location ~ \.php(/|$) {
  fastcgi_pass    127.0.0.1:16000;
  fastcgi_index   index.php;
  fastcgi_param   SCRIPT_FILENAME /database/www/mydomain.com$fastcgi_script_name;
  include         fastcgi_params;
}

# hide protected files
location ~* \.(engine|inc|info|install|module|profile|po|sh|sql|theme|tpl(\.php)?|xtmpl)$|^(code-style\.pl|Entries.*|Repository|Root|Tag|Template)$ {
  deny all;
}
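For orientation, the fragments above are meant to live inside a server block. A minimal skeleton might look like the following. The server_name is a placeholder, and the root simply reuses the path from SCRIPT_FILENAME above, so treat both as assumptions rather than as the contents of the downloadable configuration.

```nginx
server {
  listen       80;
  server_name  mydomain.com;                # placeholder host name
  root         /database/www/mydomain.com;  # assumed to match SCRIPT_FILENAME
  index        index.php;

  # ... rewrite rules, the PHP location, and the protected-file rules go here ...
}
```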
The location block for PHP scripts makes Nginx talk to the PHP FastCGI server, and the if block with the rewrite gives Drupal clean URLs. You’re done!

Download Nginx configuration files

References

An old thread from the official Drupal forum.

December 23, 2008
A Mobile Website in Drupal

How can you set up a website for mobile browsers in five hours?

First, we have websites that have RSS output, such as UPEI’s website, so we can use Drupal to aggregate news and information from them. The mobile version does not generate content of its own; it serves only as an aggregator. Drupal’s cron job automatically updates the feed items. UPEI’s mobile website aggregates feeds from UPEI websites, including media releases, department notices, and other feedable information.

Second, we use a mobile theme for Drupal as the basic theme for mobile browsers. This theme stacks blocks from top to bottom: left sidebar, content top, and right sidebar. The navigation menu can be placed in the left sidebar. We also need to modify the template file page.tpl.php to suit our needs, such as the header, the footer, and other signatures. We have to change

Third, we use an override stylesheet to provide extra styles for WebKit-based browsers, such as Mobile Safari on iPhone and Android’s browser. This stylesheet overrides font sizes, display element sizes, and word-break settings.

Then there is the final product (Use your iPhone!).

December 8, 2008
Amazon EC2

Amazon EC2 is an amazing service for those who want stability, scalability, and extensibility. Technically speaking, EC2 is an on-demand VPS (Virtual Private Server) that you pay for only when you need it. One upside of EC2 is that “purchasing” a server involves no customer service and no additional payment transactions. EC2 is billed by instance hours: if an instance is not running, you do not pay for it. EC2’s instances support up to 8 cores and 17GB memory. Its Elastic Block Store supports unlimited storage space on a pay-as-you-go basis.

Considering how unstable the MediaTemple (gs) service I am using has been, EC2 is my next move. EC2 is better supported, more stable, more flexible, and more robust than any other VPS competitor in the market, iff you are geek enough to use it.

December 3, 2008
Ragel State Machine Compiler

Ragel is a state machine compiler that generates code from Ragel’s regular expressions. It can generate C, C++, Objective-C, D, Java, and Ruby. Regular expressions and finite automata are useful in protocol analysis, data parsing, lexical analysis, and input validation. Embedding a Ragel machine in C code is very easy. Here is a Ragel implementation of the C standard library’s atoi. It is several times faster than the C standard library’s implementation.

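/*
 * Convert a string to an integer.
 *
 * Build the generated source as C++: the code uses bool and overloads
 * the standard library's atoi, neither of which plain C allows here.
 */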

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

%%{
	machine atoi;
	write data;
}%%

long long atoi( char *str )
{
	char *p = str, *pe = str + strlen( str );
	int cs;
	long long val = 0;
	bool neg = false;

	%%{
		action see_neg {
			neg = true;
		}

		action add_digit {
			val = val * 10 + (fc - '0');
		}

		main :=
			( '-'@see_neg | '+' )? ( digit @add_digit )+
			'\n';

		# Initialize and execute.
		write init;
		write exec;
	}%%

	if ( neg )
		val = -1 * val;

	if ( cs < atoi_first_final )
		fprintf( stderr, "atoi: there was an error\n" );

	return val;
}

#define BUFSIZE 1024

int main()
{
	char buf[BUFSIZE];
	while ( fgets( buf, sizeof(buf), stdin ) != 0 ) {
		long long value = atoi( buf );
		printf( "%lld\n", value );
	}
	return 0;
}

November 11, 2008
What OO is and is not

This is a short piece in defense of OO, in response to Why OO Sucks.

When we talk about Object-Oriented anything, we may be discussing one of several different aspects, such as Object-Oriented Analysis (OOA), Object-Oriented Design (OOD), or Object-Oriented Programming (OOP). OO is a sweeping adjective that emphasizes an object-based way of thinking. Briefly speaking, the Object-Oriented way is to model the world with objects, which have states and behaviours. A real-world entity has data and the ability to interact with itself or other objects. Such an entity can be abstracted using a set of states to represent its data, and a set of methods to model its interactions. An object also maintains certain invariants. Cats have tails. Dogs have tails. Cats eat. Dogs eat. They are all animals. The invariants of an object determine what class it belongs to.

OOP is the programming-language derivative of the OO principle. Like other programming paradigms, OOP is both the thinking process of creating executable code (such as object files and Java classes) and the management process of assembling executable parts (code reuse, linking). In a pure OOP environment, these two processes are fully object-oriented. Other paradigms include the procedural, functional, and logic programming paradigms. The procedural paradigm is a flow-based model with input, buffers, data manipulators, and output. The functional paradigm models everything as functions, as in mathematics; therefore, every entity in FP is a function. These paradigms are just different approaches to the same problem space. For average people, the Object-Oriented approach is so far the best way to describe the world to the computer, much as they would describe it to other human beings. Mathematicians may favour functions. People with a business education may be fond of flow charts.

We can speak English, Chinese, French, Italian, or even Greek or Latin, and we do understand different languages. However, the most efficient way to convey our thoughts is in our mother tongue. OO is good enough to be English, although it may not be your mother tongue.
