Ruby 2.3.0 changes and features

Ruby 2.3.0-preview1 has been released. Let’s see what new features we’re getting this time:


Frozen string literals

If you review gems often, you have probably seen something like this:

module MyGem
  VERSION = '1.0.2'.freeze
end

Did you ever wonder why people tend to freeze their strings? There are two benefits of doing so:

  1. First of all, you tell other programmers that this string should never change – it should remain as it is (in general, this is why freeze exists)
  2. Performance is a factor as well. Frozen strings won’t be changed anywhere in the application, so Ruby can take advantage of that to reduce object allocation and memory usage

It’s worth pointing out that Ruby 3.0 might consider all strings immutable by default (1, 2). That means you can use the current version of this feature as a first step to prepare yourself for that change.

How can you start using “frozen by default” strings without having to mark all of them with #freeze? For now, you have to add the following magic comment at the beginning of each Ruby file:

# frozen_string_literal: true

# The comment above will make all string literals in the current file frozen

Luckily, as you would expect from the Ruby community, there’s already a gem that will add the magic comment to all the files in your Ruby project: magic_frozen_string_literal.

Ok, so we lose the possibility to change strings (unless we explicitly allow it) – what do we get in return? Performance. Here are the performance results of creating 10,000,000 strings over and over again (with GC disabled). In my opinion the results are astonishing (you can find the benchmark here). With frozen strings, a simple string-based script runs 44% faster than the version without freezing.
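To get a feel for the difference yourself, here is a minimal sketch of such a benchmark (not the exact script linked above – the string and the iteration count are illustrative):

require 'benchmark'

GC.disable

Benchmark.bm(10) do |x|
  # A new String object is allocated on every iteration
  x.report('mutable:') { 10_000_000.times { 'benchmarked string' } }
  # With #freeze the same frozen literal is reused across iterations
  x.report('frozen:')  { 10_000_000.times { 'benchmarked string'.freeze } }
end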


If you wonder what will happen, if you try to modify frozen string, here’s a simple example you can run:

# frozen_string_literal: true
'string' << 'new part'

# output:
frozen.rb:3:in `<main>': can't modify frozen String (RuntimeError)

This message is straightforward enough for smaller applications, but if you deal with many strings created in many places, you might have a bit of trouble finding the exact place where a particular string was created. That’s when the --enable-frozen-string-literal-debug flag for Ruby becomes handy:

# frozen_string_literal: true
'string' << 'new part'

# Execute with --enable-frozen-string-literal-debug flag
# ruby --enable-frozen-string-literal-debug script.rb
# output:
frozen.rb:3:in `<main>': can't modify frozen String, created at frozen.rb:3 (RuntimeError)

It will tell you not only the place where you tried to modify a frozen string, but also the place where that string was created.

Immutable strings also take us one step closer to understanding, introducing and using the concept of immutable objects in Ruby.

Safe navigation operator

Easy nil handling FTW! Finally a ready-to-go replacement for many #try use cases (though not for all of them). A small yet really useful feature, especially for non-Rails people. So what exactly does Object#try do?

Object#try – invokes the public method whose name is passed as the first argument, just like public_send does, except that if the receiver does not respond to it, the call returns nil rather than raising an exception.

This is how you can use it:

# Ruby 2.2.3
user = User.first

if user && user.profile
  puts "User: #{user.profile.nick}"
end

# Ruby 2.3.0
user = User.first

if user&.profile
  puts "User: #{user.profile.nick}"
end

At first that might not look too helpful, but imagine a chain of checks similar to this one:

if user&.profile&.settings&.deliver?
  # and so on...
end
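For comparison, the same guard written without the safe navigation operator (Ruby 2.2 style) would look like this:

# Every link of the chain has to be checked by hand
if user && user.profile && user.profile.settings && user.profile.settings.deliver?
  # and so on...
end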

Warning: it is worth pointing out that Object#try from ActiveSupport and the safe navigation operator differ a bit in their behaviour, so you need to closely consider each case in which you want to replace one with the other (in either direction). When we have a nil object on which we want to invoke a method (whether via try or the safe navigation operator), they behave the same:

user = User.first # no users - user contains nil
user.nil? #=> true

user.try(:name) #=> nil
user&.name #=> nil

However, their behaviour differs significantly when we have non-nil objects that may (or may not) respond to a given method:

class User
  def name
    'name'
  end
end

class Student < User
  def surname
    'surname'
  end
end

user = User.new

# This won't fail - even when the user exists but does not respond to a #surname method
if user.try(:surname)
  puts user.surname
end

user = Student.new

if user.try(:surname)
  puts user.surname
end

# With an object that has a #surname method the safe
# navigation operator will work exactly the same as try
if user&.surname
  puts user.surname
end

# However it will fail if the object exists but does not have a #surname method
user = User.new

if user&.surname
  puts user.surname
end

NoMethodError: undefined method `surname' for #<User:0x000000053691d8>

Does this difference really matter? I strongly believe so! #try gives you way more freedom in terms of what you can do with it. You can have multiple objects of different types, check whether they are not nil and whether they respond to a given method, and react based on that. That is not the case with the safe navigation operator – it requires any non-nil object to respond to a given method – otherwise it will raise a NoMethodError. This is how they look (more or less) when implemented in Ruby:

# try method simplified
# We skip arguments for simplicity
def try(method_name)
  return nil if nil?
  return nil unless respond_to?(method_name)
  public_send(method_name)
end

# safe navigation behaviour expressed as a method
def safe_navigator(method_name)
  return nil if nil?
  # Does not care whether the object responds to a given method
  public_send(method_name)
end

And that’s why, when replacing #try with the safe navigation operator (or the other way around), you need to be extra cautious!

Did you mean

A small debugging helper built directly into Ruby. In general it will help you detect typos and misspellings that cause your code to fail. Will it come in handy? I’ll tell you once I’ve used it for a while.

class User
  def name
    'name'
  end
end

user = User.new
puts use.name # note the typo: "use" instead of "user"

# Error you will receive:
user.rb:7:in `<main>': undefined local variable or method `use' for main:Object (NameError)
Did you mean?  user
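The suggestions also kick in for misspelled method names – a hypothetical example (output approximated):

user = User.new
user.nmae # typo of "name"

# output (something like):
undefined method `nmae' for #<User:0x000000029a88e8> (NoMethodError)
Did you mean?  name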

Hash Comparison

This is a really nice feature: comparing hashes like numbers. Up until now we had the #include? method on Hash, but it could only check whether a given key is included (we could not check whether one hash contains another):

{ a: 1, b: 2 }.include?(:a) #=> true
{ a: 1, b: 2 }.include?(a: 1) #=> false

Now you can check it using comparisons:

{ a: 1, b: 2 } >= { a: 1 } #=> true
{ a: 1, b: 2 } >= { a: 2 } #=> false

That way you can compare not only keys but also their values. Here are some examples that should help you understand how it works:

Same keys and content

{ a: 1 } <= { a: 1 } #=> true
{ a: 1 } >= { a: 1 } #=> true
{ a: 1 } == { a: 1 } #=> true
{ a: 1 } >  { a: 1 } #=> false
{ a: 1 } <  { a: 1 } #=> false

Same keys, different content

{ a: 1 } <= { a: 2 } #=> false
{ a: 1 } >= { a: 2 } #=> false
{ a: 1 } == { a: 2 } #=> false
{ a: 1 } >  { a: 2 } #=> false
{ a: 1 } <  { a: 2 } #=> false

Same content, different keys

{ a: 1 } <= { b: 1 } #=> false
{ a: 1 } >= { b: 1 } #=> false
{ a: 1 } == { b: 1 } #=> false
{ a: 1 } >  { b: 1 } #=> false
{ a: 1 } <  { b: 1 } #=> false

Hash containing different one (with same values)

{ a: 1, b: 2 } <= { a: 1 } #=> false
{ a: 1, b: 2 } >= { a: 1 } #=> true
{ a: 1, b: 2 } == { a: 1 } #=> false
{ a: 1, b: 2 } >  { a: 1 } #=> true
{ a: 1, b: 2 } <  { a: 1 } #=> false

Hash containing different one (with different values)

{ a: 1, b: 2 } <= { a: 3 } #=> false
{ a: 1, b: 2 } >= { a: 3 } #=> false
{ a: 1, b: 2 } == { a: 3 } #=> false
{ a: 1, b: 2 } >  { a: 3 } #=> false
{ a: 1, b: 2 } <  { a: 3 } #=> false
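One handy application is checking whether a configuration hash covers some required defaults – an illustrative example (names made up):

required = { logging: true }
config   = { logging: true, port: 3000 }

# config contains every key-value pair from required
config >= required #=> true
config <  required #=> false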

Numeric#negative? and #positive?

A simple example should be enough to describe this feature:

elements = (-10..10).to_a

elements.select(&:positive?) #=> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
elements.select(&:negative?) #=> [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1]
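These predicates also pair nicely with other Enumerable methods via Symbol#to_proc, for example:

# Split a collection into negative numbers and everything else
negatives, rest = (-10..10).partition(&:negative?)

negatives #=> [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1]
rest      #=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]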

Hash#dig and Array#dig

Ruby has always been about doing things in a short and easy way. Now this can also be said about retrieving deeply nested data from hashes and arrays, thanks to #dig. No more || {} tricks or anything like that!

# Nested hashes
settings = {
  api: {
    adwords: {
      url: 'url'
    }
  }
}

settings.dig(:api, :adwords, :url) # => 'url'
settings.dig(:api, :adwords, :key) # => nil

# Way better than this:
((settings[:api] || {})[:adwords] || {})[:url]
# Nested arrays
nested_data = [
  [
    %w( a b c )
  ]
]

nested_data.dig(0, 0, 0) # => 'a'
nested_data.dig(0, 1, 0) # => nil

# Way better than this:
((nested_data[0] || [])[0] || [])[0]
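Since both Hash and Array implement #dig, it also traverses mixed structures – an illustrative example:

# Mixed nesting: hash -> array -> hash
users = { admins: [{ nick: 'kate' }] }

users.dig(:admins, 0, :nick) #=> 'kate'
users.dig(:admins, 1, :nick) #=> nil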


Hash#to_proc

This feature allows us to use a hash as a block when iterating over enumerables:

h = { a: 1, b: 2, c: 3 }
k = %i{ a b c d }

k.map(&h) #=> [1, 2, 3, nil]


Hash#fetch_values

A more strict version of Hash#values_at:

settings = {
  name: 'app_name',
  version: '1.0'
}

settings.values_at(:name, :key)    #=> ['app_name', nil]
settings.fetch_values(:name, :key) #=> exception will be raised
# KeyError: key not found: :key
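If raising is too strict for your use case, #fetch_values also accepts a block that is called for each missing key:

settings.fetch_values(:name, :key) { |key| "missing: #{key}" }
#=> ['app_name', 'missing: key']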


Enumerable#grep_v

I could have used this one a few times. Instead of negating a regexp itself, we can now just specify that we want all the elements that do not match a given regexp. It also works for filtering by types.

# Regexp example
data = %w(
  user@example.com
  notanemail
  thisisnotanemailaswell@
)

regexp = /.+\@.+\..+/

data.grep(regexp)   #=> ['user@example.com']
data.grep_v(regexp) #=> ['notanemail', 'thisisnotanemailaswell@']
# Type matching example
data = [nil, [], {}]

data.grep(Array) #=> [[]]
data.grep_v(Array) #=> [nil, {}]


Summary

It’s been a while since we had a new Ruby version with some additional features. It’s not a big step, but it definitely brings something to the table. There is nothing here that we could not do with plain old Ruby, but now many things will require much less work. If we consider this a transitional release, it feels like things are heading in a good direction (towards 3.0).


Benchmarking Karafka – how does it handle multiple TCP connections

Recently I’ve released a Ruby Apache Kafka microframework; however, I don’t expect anyone to use it without at least a bit of information on what it can do. Here are some measurements I took.

How Karafka handles multiple TCP connections

Since listening to multiple topics requires multiple TCP connections, it is pretty obvious that in order to obtain decent performance, we are using threads (a process clustering feature is in progress). Each controller that you create could theoretically have its own thread and listen all the time, but with a bigger application this could slow things down. That’s why we introduced topic clustering. When you configure your Karafka application, you should specify the concurrency parameter:

class App < Karafka::App
  setup do |config|
    # Other config options
    config.concurrency = 10 # 10 threads max
  end
end

This is the maximum number of threads that will be used to listen for incoming messages. It is pretty simple when you have fewer controllers (topics) than threads – each topic just gets a single thread. However, if you have more controllers than threads, several connections will be packed into a single thread (wrapped with Karafka::Connection::ThreadCluster). For example, with 2 threads and 4 controllers, each thread cluster handles 2 connections.


In general, it will distribute TCP connections across threads evenly. So, if you have 20 controllers and 5 threads, each thread will be responsible for checking 4 sockets, one after another. Since it won’t do this simultaneously, Karafka will slow down. How much? It depends – if there’s something on each of the topics, you will get around 24% (per controller) of the base performance out of each connection.
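To illustrate the even distribution, here is a rough sketch (illustrative only, not Karafka’s actual implementation):

# 20 controllers distributed across 5 thread clusters
controllers   = (1..20).to_a
threads_count = 5

# Pack an equal share of connections into each cluster
clusters = controllers.each_slice(controllers.size / threads_count).to_a

clusters.size       #=> 5 - one group per thread
clusters.first.size #=> 4 - sockets checked one after another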

Other things that have an impact on performance

When considering this framework’s performance, you need to keep in mind that:

  • It is strongly dependent on what you do in your code
  • It depends also on Apache Kafka performance
  • Connection between Karafka and Redis (for Sidekiq) is a factor as well
  • All the benchmarks show the performance without any business logic
  • All the benchmarks show the performance without enqueuing to Sidekiq
  • It also depends on what type of infrastructure you run the benchmarks on
  • Message size is a factor as well (since each message gets deserialized from JSON by default)
  • Ruby version – I’ve been testing it on MRI (CRuby) 2.2.3 – Karafka does not yet work with other Ruby distributions (JRuby or Rubinius), but this should change once some of the dependencies stop using refinements



For each of the benchmarks I measured the time taken to consume all the messages stored in Kafka. There was no business logic involved (just message processing by the framework). My local Kafka setup was the default one (no settings were changed), introduced with this Docker container.

I’ve tested up to 5 topics, each with 1,000,000 messages loaded. Since Karafka lazy-loads params, the benchmark does not include the time needed to parse the messages. Parsing performance strongly depends on the parser you pick (JSON by default) and the message size. These benchmarks measure the maximum throughput we can get while receiving messages.

Note: all the benchmarking was performed on my 16GB, 4-core i7 Linux laptop. During the benchmarking I was performing other tasks that might have had a small impact on the overall results (although nothing heavy).

1 thread

With a single thread it is pretty straightforward – the more controllers we have, the less we can process per controller. There’s also controller context-switching overhead that consumes some of the processing power, so each additional controller lets us consume less and less. Switching between controllers seems to consume around 11% of a single controller’s performance when we use more than 1 controller in a single-threaded application.

[benchmark results chart: 1 thread]
Context switching between controllers in a single thread costs us around 1% of the overall performance per additional controller (if you’re eager to know what we’re planning to do about it, scroll down to the summary). On one hand that is a lot; on the other, with a bigger application you should probably run Karafka in multithreaded mode anyway. That way context switching won’t be as painful.

2 threads

[benchmark results chart: 2 threads]
General performance with 2 threads and 2 controllers shows that we’re able to lower the switching impact on the overall performance, gaining around 1.5-2k requests per second overall.

3 threads

[benchmark results chart: 3 threads]
5 controllers with 3 threads vs 5 controllers with 1 thread: 7% better performance.

4 threads

[benchmark results chart: 4 threads]

5 threads

[benchmark results chart: 5 threads]

Benchmark results


The overall performance of a single Karafka process is highly dependent on the way it is used. Because of the GIL, when we receive data from sockets, we can only process incoming messages from a single socket at a time. So in general we’re limited to around 30-33k requests per second per process. It means that the bigger the application gets, the slower it works (when we consider total performance per controller). However, this is only valid when all the topics are always full of messages. Otherwise we don’t process; we wait on IO, and Ruby can handle incoming messages from multiple threads. That’s why it is worth starting Karafka with a decent concurrency level.

How can we increase throughput for Karafka applications? For now, we can create multiple partitions for a single topic and spin up multiple Karafka processes; they will then load-balance across partitions automatically. This solution has one downside: if we have only a few topics with multiple partitions and the rest with a single one, then some of the threads in Karafka won’t perform any work. This will be fixed soon (we’re already working on it) when we introduce Karafka process clustering. It will allow spinning up multiple Karafka processes (in a single cluster), each listening only to a given subset of controllers. That way the overall performance will increase significantly. But still, being able to perform 30k rq/s is not that bad, right? ;)