thoughts of a doug

Should Perl Modules Default to Non-Blocking?

Posted on 20 October 2011, 13:10, by Douglas Wilson, under Perl.

I have been thinking about making my modules, both the existing ones and any I write in the future, non-blocking (for example by building on AnyEvent or something else generic). These days my main work involves node.js, which emphasizes non-blocking code (IMO). Then a post came up recently that made me think about this again. (Yes, I know that in this particular case non-blocking would not have solved the issue, which was, apparently, that the person did not want to use event-based programming, but it reminded me of my thoughts on non-blocking.)

The main thing with non-blocking is that a blocking module is hard to use in a non-blocking environment unless you can shove the blocking behavior into a background thread and put an event wrapper around it. A non-blocking module, on the other hand, can always be used in a blocking way (just wait for its result before continuing), so I think this makes it a win-win.

Note: By “default to non-blocking” I mean that the modules can behave in a non-blocking way, so a module supports both blocking and non-blocking use.
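
To make the "a non-blocking module can always be used in a blocking way" point concrete, here is a minimal sketch. It is written in Python rather than Perl purely for illustration, and the names `lookup_async`, `lookup` and `lookup_many` are hypothetical, not taken from any real module. The non-blocking function is the core, and the blocking one is a thin wrapper that waits for its result.

```python
import asyncio


async def lookup_async(key, delay=0.01):
    """Non-blocking API: yields to the event loop while it 'waits'."""
    # asyncio.sleep stands in for real network I/O (an assumption for the demo).
    await asyncio.sleep(delay)
    return key.upper()


def lookup(key):
    """Blocking API: runs a private event loop until the result is ready."""
    return asyncio.run(lookup_async(key))


async def lookup_many(keys):
    """Event-based callers can overlap several requests instead of queuing them."""
    return await asyncio.gather(*(lookup_async(k) for k in keys))


if __name__ == "__main__":
    print(lookup("perl"))                               # blocking style
    print(asyncio.run(lookup_many(["cpan", "dbi"])))    # non-blocking style
```

Going the other way (wrapping a blocking function so it behaves well inside an event loop) generally needs a thread or a fork, which is the asymmetry described above.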

Tags: AnyEvent, cpan, Perl, Perl5

6 Comments

Gabor Szabo says:

21 October 2011 at 1:41

I admit I don’t understand what blocking and non-blocking mean in the general case. Say I have a math-related module with a function to sum() numbers. How could that be non-blocking? What about a sub that takes a long time to calculate something?

- Douglas Wilson says:
  
  21 October 2011 at 11:34
  
  Well, “blocking” refers to your thread of execution having to wait for work that it is not performing itself. In the example of sum()ing numbers, non-blocking doesn’t apply: the function is actually doing work, so it is not “blocking”. Typically the things that “block” are I/O operations. If you make a network request, for example, then once your program has handed the packet data to the network stack, the time it spends sitting there doing nothing but waiting for incoming data is when it is “blocked” (the socket is said to be “blocking” your program). You could be summing those numbers while you wait for a response from the network 🙂 I hope this makes sense. You can also find further information at https://secure.wikimedia.org/wikipedia/en/wiki/Non-blocking_I/O
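
The "summing those numbers while you wait for the network" idea can be sketched roughly as follows. This is an illustration in Python, not Perl. The function name `sum_while_waiting`, the timer that plays the part of a slow remote server, and the chunk size are all assumptions made for the demo.

```python
import select
import socket
import threading


def sum_while_waiting(numbers, chunk=1000, delay=0.05):
    """Sum `numbers` in chunks while waiting for a reply on a socket."""
    ours, theirs = socket.socketpair()
    ours.setblocking(False)

    # Stand-in for a remote server that answers after `delay` seconds.
    timer = threading.Timer(delay, theirs.sendall, args=(b"pong",))
    timer.start()
    ours.send(b"ping")  # the "request"; from here on we would normally block

    total, pos, reply = 0, 0, None
    while reply is None or pos < len(numbers):
        if reply is None:
            # Only poll (timeout 0) while there is still work to do;
            # once the sum is finished, really wait for the socket.
            timeout = 0 if pos < len(numbers) else None
            readable, _, _ = select.select([ours], [], [], timeout)
            if readable:
                reply = ours.recv(4096)
        if pos < len(numbers):
            total += sum(numbers[pos:pos + chunk])  # useful work meanwhile
            pos += chunk

    timer.join()
    ours.close()
    theirs.close()
    return total, reply
```

A blocking version would call `recv` right after sending and sit idle for the whole delay. Here the wait is filled with the summing, and the socket is checked between chunks.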
  
Max says:

21 October 2011 at 2:52

We have many great Net::* modules on CPAN, but only a few handle non-blocking situations. It’s a pity when you design a big event-based application. There should be no need to rely on AnyEvent:: or POE:: ports of them…

YES! YES! YES! We need non blocking modules!

If Net::* modules handled an “async” mode, they would be embeddable in any event based application, via AnyEvent::, POE::, IO::Async:: or whatever…

At design time it is not a big job: they just have to expose their main filehandle and offer callbacks to be called on different events. The original blocking API can remain unchanged, and they would become event-loop agnostic.

The main problem comes from interfaces to badly designed C libraries, which is often the case with database ones.

I dream of an event-based DBI module! Then I wouldn’t need to fork to handle a simple database connection…
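
The "expose the main filehandle and offer callbacks" design can be sketched like this. It is a Python illustration, not real CPAN code. `EchoClient`, `handle_readable` and `demo` are hypothetical names, and the standard `selectors` module stands in for whichever event loop the application actually uses.

```python
import selectors
import socket


class EchoClient:
    """Hypothetical protocol module: owns a socket but never runs a loop itself."""

    def __init__(self, sock, on_message):
        sock.setblocking(False)
        self._sock = sock
        self._on_message = on_message  # callback supplied by the application

    def fileno(self):
        """Expose the underlying handle so any event loop can watch it."""
        return self._sock.fileno()

    def handle_readable(self):
        """To be called by the loop when the handle is reported readable."""
        data = self._sock.recv(4096)
        if data:
            self._on_message(data)


def demo():
    ours, theirs = socket.socketpair()
    received = []
    client = EchoClient(ours, received.append)

    # The "event loop" here is a bare selector; the module does not care which.
    sel = selectors.DefaultSelector()
    sel.register(client, selectors.EVENT_READ, client.handle_readable)

    theirs.sendall(b"hello")
    for key, _events in sel.select(timeout=1):
        key.data()  # invoke the handler registered for this handle

    sel.close()
    ours.close()
    theirs.close()
    return received
```

The module only needs two things to be loop agnostic: a way to hand out its handle, and an entry point the loop can call when that handle is ready.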

- Douglas Wilson says:
  
  21 October 2011 at 11:53
  
  In fact, as of version 4.019 DBD::mysql exposes a file handle, so you can make it non-blocking with your favorite event system: http://blogs.perl.org/users/hoelzro/2011/10/asynchronous-mysql-queries-in-perl-using-dbdmysql-and-anyevent.html
  
- Tom Molesworth says:
  
  21 October 2011 at 13:44
  
  Although providing a way to watch the underlying IO layer for read/write readiness is a good first step, personally I don’t find the DBI API particularly convenient for async/event-based database handling. Methods such as ->fetch_row can be implemented using condvars or similar, and I believe there are some POE components that wrap DBI but still give the familiar ->prepare, ->bind, ->execute, ->fetch methods. But if the rest of the codebase is purely event-driven through callbacks, the db code starts to look a bit out of place.
  
  So far my workaround has been to implement an abstraction layer for the wire protocol (trivial enough for pg/mysql since their protocols are very well documented), for cases where query runtime is significant enough that the application would otherwise spend more time waiting for the database than doing useful work – see Net::Async::PostgreSQL for example.
  
  Of course, this is just the low-level database interaction layer – I wouldn’t like to write an entire application using just DBI when there are excellent modules like DBIx::Class around to handle the higher-level pieces, for example – but unfortunately there don’t seem to be any good event-based ORM modules around yet. Again, I resorted to writing my own in this case (‘EntityModel’ on CPAN, although I wouldn’t suggest using the existing CPAN version in a production environment) but it’d be interesting to see if async features start making their way into DBIx::Class and related modules, and how well they work against the existing DBI API.
  
  I think the temptation when writing a new module for handling a network protocol is to start with something along the lines of IO::Socket->connect(…), rather than separating out the protocol and transport layers (as per Paul’s comment below). This has the advantage of getting things up-and-running a lot quicker, but does tend to tie it all a bit too tightly to the transport layer.
  
  Personally I find testing to be a lot easier if you can run predefined strings through the protocol part of the code rather than having to run against a real server, and implementing both client and server parts is often easier if you have that protocol layer in place. Plus, this way you can do things like proxies and protocol-over-some-other-protocol layering, and it also becomes easier to build tools such as debuggers that know about the packet structure.
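
A protocol layer that is kept separate from the transport might look roughly like this. It is a Python sketch, the CRLF-delimited text protocol is invented for the example, and `LineProtocol` is a hypothetical name. The class only ever sees bytes, so it can be tested with predefined strings and no server.

```python
class LineProtocol:
    """Parses CRLF-terminated messages from arbitrary chunks of bytes."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        """Accept bytes from any transport; return the complete messages so far."""
        self._buffer += data
        # Everything before the last CRLF is complete; the remainder is kept
        # in the buffer until more bytes arrive.
        *lines, self._buffer = self._buffer.split(b"\r\n")
        return [line.decode("ascii") for line in lines]

    @staticmethod
    def encode(message):
        """Turn one message into bytes ready for whatever transport is in use."""
        return message.encode("ascii") + b"\r\n"


if __name__ == "__main__":
    proto = LineProtocol()
    # Chunk boundaries deliberately fall in awkward places.
    for chunk in (b"PING\r\nPO", b"NG\r", b"\nQUIT\r\n"):
        print(proto.feed(chunk))
```

The same object can sit behind a blocking socket read, an event-loop callback, or a proxy, because it never touches the transport itself.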
  
Paul "LeoNerd" Evans says:

21 October 2011 at 12:29

My usual approach is either to write something truly low-level and abstract that manipulates byte buffers (see e.g. Protocol::WebSocket, or Tangence), or to write something that can be used in a blocking or non-blocking fashion (e.g. see Term::TermKey). Having done that, it’s fairly simple to build the entire trio of AnyEvent/POE/IO::Async wrapper modules: see AnyEvent::TermKey, POE::Wheel::TermKey, Term::TermKey::Async.

In fact, this very topic will be the subject of my main talk at LPW2011 this year, presuming my talk proposal is accepted.
