I saw different binaries for PHP, like non-thread or thread safe?
What does this mean?
What is the difference between these packages?
I saw different binaries for PHP, like non-thread or thread safe?
What does this mean?
What is the difference between these packages?
Different web servers implement different techniques for handling incoming HTTP requests in parallel. A pretty popular technique is using threads -- that is, the web server will create/dedicate a single thread for each incoming request. The Apache HTTP web server supports multiple models for handling requests, one of which (called the worker MPM) uses threads. But it supports another concurrency model called the prefork MPM which uses processes -- that is, the web server will create/dedicate a single process for each request.
There are also other completely different concurrency models (using Asynchronous sockets and I/O), as well as ones that mix two or even three models together. For the purpose of answering this question, we are only concerned with the two models above, and taking Apache HTTP server as an example.
PHP itself does not respond to the actual HTTP requests -- this is the job of the web server. So we configure the web server to forward requests to PHP for processing, then receive the result and send it back to the user. There are multiple ways to chain the web server with PHP. For Apache HTTP Server, the most popular is "mod_php". This module is actually PHP itself, but compiled as a module for the web server, and so it gets loaded right inside it.
There are other methods for chaining PHP with Apache and other web servers, but mod_php is the most popular one and will also serve for answering your question.
You may not have needed to understand these details before, because hosting companies and GNU/Linux distros come with everything prepared for us.
Since with mod_php, PHP gets loaded right into Apache, if Apache is going to handle concurrency using its Worker MPM (that is, using Threads) then PHP must be able to operate within this same multi-threaded environment -- meaning, PHP has to be thread-safe to be able to play ball correctly with Apache!
At this point, you should be thinking "OK, so if I'm using a multi-threaded web server and I'm going to embed PHP right into it, then I must use the thread-safe version of PHP". And this would be correct thinking. However, as it happens, PHP's thread-safety is highly disputed. It's a use-if-you-really-really-know-what-you-are-doing ground.
In case you are wondering, my personal advice would be to not use PHP in a multi-threaded environment if you have the choice!
Speaking only of Unix-based environments, I'd say that fortunately, you only have to think of this if you are going to use PHP with Apache web server, in which case you are advised to go with the prefork MPM of Apache (which doesn't use threads, and therefore, PHP thread-safety doesn't matter) and all GNU/Linux distributions that I know of will take that decision for you when you are installing Apache + PHP through their package system, without even prompting you for a choice. If you are going to use other webservers such as nginx or lighttpd, you won't have the option to embed PHP into them anyway. You will be looking at using FastCGI or something equal which works in a different model where PHP is totally outside of the web server with multiple PHP processes used for answering requests through e.g. FastCGI. For such cases, thread-safety also doesn't matter. To see which version your website is using put a file containing <?php phpinfo(); ?>
on your site and look for the Server API
entry. This could say something like CGI/FastCGI
or Apache 2.0 Handler
.
If you also look at the command-line version of PHP -- thread safety does not matter.
Finally, if thread-safety doesn't matter so which version should you use -- the thread-safe or the non-thread-safe? Frankly, I don't have a scientific answer! But I'd guess that the non-thread-safe version is faster and/or less buggy, or otherwise they would have just offered the thread-safe version and not bothered to give us the choice!
For me, I always choose non-thread safe version because I always use nginx, or run PHP from the command line.
The non-thread safe version should be used if you install PHP as a CGI binary, command line interface or other environment where only a single thread is used.
A thread-safe version should be used if you install PHP as an Apache module in a worker MPM (multi-processing model) or other environment where multiple PHP threads run concurrently - simply put, any CGI/FastCGI build of PHP does not require thread safety.
Apache MPM prefork with modphp is used because it is easy to configure/install. Performance-wise it is fairly inefficient. My preferred way to do the stack, FastCGI/PHP-FPM. That way you can use the much faster MPM Worker. The whole PHP remains non-threaded, but Apache serves threaded (like it should).
So basically, from bottom to top
Linux
Apache + MPM Worker + ModFastCGI (NOT FCGI) |(or)| Cherokee |(or)| Nginx
PHP-FPM + APC
ModFCGI does not correctly support PHP-FPM, or any external FastCGI applications. It only supports non-process managed FastCGI scripts. PHP-FPM is the PHP FastCGI process manager.
As per PHP Documentation,
Thread Safety means that binary can work in a multithreaded webserver context, such as Apache 2 on Windows. Thread Safety works by creating a local storage copy in each thread, so that the data won't collide with another thread.
So what do I choose? If you choose to run PHP as a CGI binary, then you won't need thread safety, because the binary is invoked at each request. For multithreaded webservers, such as IIS5 and IIS6, you should use the threaded version of PHP.
Following Libraries are not thread safe. They are not recommended for use in a multi-threaded environment.
The other answers address SAPIs implementations, and while this is relevant the question asks the difference between the thread-safe vs non thread-safe distributions.
First, PHP is compiled as an embeddable library, such as libphp.so on *NIX and php.dll on Windows. This library can be embedded into any C/CPP application, but obviously it is primarily used on web servers. At it's core PHP starts up in in two major phases, the module init phase and then request init phase. Module init initializes the PHP core and all extensions, where request init initializes PHP userspace - both native userspace features as well as PHP code itself.
The PHP library is setup to where the module phase only has to be called on once, but the request phase has to be reinitialized for each HTTP request. Note that CLI links to the same library as mod_php ect, and still has to go through these phases internally even though it may not be used in the context of processing HTTP requests. Also, it's important to note that PHP isn't literally designed for processing HTTP requests - most accurately, it is designed for processing CGI events. Again, this isn't just php-cgi, but all SAPI/applications including php-fpm, mod_php, CLI and even the exceedingly rare PHP desktop application.
Webservers (or more typically SAPIs) that link to libphp tend to follow one of four patterns:
Note that in examples 2 and 3 the child process is typically terminated after each request. In example 2 the child process must be terminated at the end of each request.
The forth example is related to threaded implementations
In the threaded case, request handling threads tend to utilize a thread pool, with each thread running in a loop initializing the request phase at the beginning and than destroying the request phase at the end which is more optimal than spawning a new thread per request
Regardless of how threaded implementations utilize libphp, if the module phase is initialized in one thread and request phases are called in different threads (which is the case PHP was designed for) it requires a non-trivial amount of synchronization, not just within the PHP core, but also within all native PHP extensions. Note that this is not just a matter of a “request” at this point, but synchronization that it being called on per PHP OPCODE that relies on any form of resource within the PHP core (or any PHP extension) which exists in a different thread as PHP userspace.
This places a huge demand on synchronization within thread-safe PHP distributions, which is why PHP tends to follow a "share nothing" rule which helps minimize the impact, but there is no such thing as truly "sharing nothing" in this pattern, unless each thread contains a completely separate PHP context, where the module phase and request phase is all done within the same thread per request which is not suggested or supported. If the context built within the module init phase is in a separate thread as the request init phase there will most definitely be sharing between threads. So the best attempt is made to minimize context within the module init phase that must be shared between threads, but this is not easy and in some cases not possible.
This is especially true in more complicated extensions which have their own requirements of how a their own context must be shared between threads, with openssl being a major culprit of of this example which effectually extends outward to any extension that uses it, whether internal such as PHP stream handlers or external such as sockets, curl, database extensions, etc.
If not obvious at this point, thread-safe vs non thread-safe is not just a matter of how PHP works internally as a “request handler” within an SAPI implementation, but a matter of how PHP works internally as an embedded virtual machine for a programming language.
This is all made possible by the TSRM, or the thread safe resource manager, which is well made and handles a very large amount of synchronization with little perceived overhead, but the overhead is definitely there and will grow not just based on how many requests per second that the server must handle (the deciding factor on how may threads the SAPI requires), but also by how much PHP code is used per request (or per execution). In other words, large bloated frameworks can make a real difference when it comes specifically to TSRM overhead. This isn't to speak of overall PHP performance and resource requirements within thread-safe PHP, but just the additional overhead of TSRM itself within thread-safe PHP.
As such, pre compiled PHP is distributed in two flavors, one built where TSRM is active in libphp (thread-safe) and one where libphp does not use any TSRM features (non thread-safe) and thus does not have the overhead of TSRM.
Also note that the flag used to compile PHP with TSRM (--enable-maintainer-zts or --with-zts in later PHP versions) causes phpize to extend this outward into the compilation of extensions and how they initialize their own libraries (libssl, libzip, libcurl, etc) which will often have their own way of compiling for thread-safe vs non thread-safe implementations, i.e their own synchronization mechanisms outside of TSRM and PHP as a whole. While this not exactly PHP related, in the end will still have effect on PHP performance outside of TSRM (meaning on top of TSRM). As such, PHP extensions (and their dependents, as well as external libraries PHP or extensions link to or otherwise depend on) will often have different attributes in thead-safe PHP distributions.