Ruby: Karafka framework 0.4 – Routing engine

We’ve finally released a new version of the Karafka framework. Apart from many tweaks and bug fixes, it contains three huge improvements:

  • ApplicationWorker and ApplicationController
  • Routing engine
  • Offset commit after data processing (not only after fetching)

In this article I will focus on the routing engine.

Pre-routing times

The first version of the Karafka framework routed incoming messages based on controller names. For example:

  • videos topic => VideosController
  • movies_subtitles => Movies::SubtitlesController
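
A rough sketch of how such name-based routing could work, assuming that each underscore in a topic name starts a new namespace level. The helper name controller_name_for is made up for illustration and is not part of Karafka:

```ruby
# Hypothetical helper (not Karafka's code): maps a topic name to the
# controller class name that the old convention would have picked.
def controller_name_for(topic)
  parts = topic.to_s.split('_').map(&:capitalize)
  "#{parts.join('::')}Controller"
end

puts controller_name_for(:videos)
puts controller_name_for(:movies_subtitles)
```

With a convention like this, the topic name and the controller name cannot change independently of each other.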

It was also possible to change that by setting the topic property of a controller:

class SuperController < Karafka::BaseController
  self.topic = :my_data
end

Unfortunately it turned out that this approach had way more disadvantages than advantages:

  • Fat, non-isolated routing logic bound tightly to controllers
  • No possibility to use inheritance
  • No possibility to use same controller logic for multiple routes (at least not directly)
  • A lot of non-controller-related logic that had to be there for the sake of framework DSL consistency (interchanger, group, etc.)

Luckily those times are gone and now we have a nice shiny routing engine.

Note: If you’re not familiar with terms like topic, producer, etc – please refer to an excellent post about that: Kafka for Rubyists – Part 1

Karafka routing basics

Karafka has “Rails-like” routing that allows you to describe how messages from each topic should be handled. Such a separation layer (between topics, controllers and workers) gives you better control over the message “flow”.

App.routes.draw do
  topic :example do
    controller ExampleController
  end
end

By default Karafka requires topic and controller. Note that controllers should be provided as class names, not as strings/symbols (like in Rails). For most of the cases this will be all you need. Everything else will be built automatically under the hood.
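
To show the idea behind such a DSL, here is a toy router in plain Ruby. It is only a sketch and not Karafka's implementation: the MiniRouter and MiniRoute classes are invented for this example, and they only store the settings and check that a controller was given.

```ruby
# Toy version of a "draw" style routing DSL (illustration only).
class MiniRoute
  OPTIONS = %i[controller group worker parser interchanger].freeze

  attr_reader :topic

  def initialize(topic)
    @topic = topic
    @settings = {}
  end

  # Each option works as a setter when called with a value
  # and as a getter when called without one.
  OPTIONS.each do |option|
    define_method(option) do |value = nil|
      @settings[option] = value unless value.nil?
      @settings[option]
    end
  end

  def validate!
    raise ArgumentError, "topic #{topic} needs a controller" if controller.nil?

    self
  end
end

class MiniRouter
  attr_reader :routes

  def initialize
    @routes = {}
  end

  # Evaluates the block in the router's context so that "topic" is available.
  def draw(&block)
    instance_eval(&block)
    self
  end

  def topic(name, &block)
    route = MiniRoute.new(name)
    route.instance_eval(&block)
    @routes[name] = route.validate!
  end
end

class ExampleController; end

router = MiniRouter.new.draw do
  topic :example do
    controller ExampleController
  end
end

puts router.routes[:example].controller
```

Passing the controller as a class (not a string) means a typo fails immediately with a NameError when the routes are loaded.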

Kafka has a flat topic structure. There are no namespaces or nested topics (however it might change in the future, since there’s a discussion about that). It means that your Karafka routes will be flat as well: a single topic block that contains all the details on how to handle messages from it.

The following options can be used inside a topic block:

  • Controller (required)
  • Group (optional)
  • Worker (optional)
  • Parser (optional)
  • Interchanger (optional)

Here’s an example of how to use all the options together:

App.routes.draw do
  topic :binary_video_details do
    group :composed_application
    controller Videos::DetailsController
    worker Workers::DetailsWorker
    parser Parsers::BinaryToJson
    interchanger Interchangers::Binary
  end
end

Karafka controller

A Karafka controller is similar to a Rails controller. It is responsible for handling each incoming message.

Karafka group

Taken from Apache Kafka documentation:

Consumers label themselves with a consumer group name, and each message published to a topic is delivered to one consumer instance within each subscribing consumer group. Consumer instances can be in separate processes or on separate machines.

If you’re not planning to build multiple applications (not processes) from which only one needs to consume a given message, don’t worry about that option at all. Karafka will take care of that for you.

Karafka worker

Karafka by default will build a worker that will correspond to each of your controllers (so you will have a pair – controller and a worker). All of them will inherit from ApplicationWorker and will share all its settings.

You can override this by providing your own worker. Just keep in mind that if it does not inherit from ApplicationWorker, it won’t have the interchanging and unparsing logic and it won’t execute your controller #perform code. In cases like that you’ll have to implement your business logic inside the worker.
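
To make the default controller/worker pairing more concrete, here is a plain Ruby sketch of how a matching worker class could be generated for a controller. This is an assumption about the general technique, not Karafka's actual code; default_worker_for and the stand-in ApplicationWorker below are hypothetical.

```ruby
# Stand-in base worker with one shared setting (hypothetical).
class ApplicationWorker
  def self.queue
    :karafka
  end
end

module Videos
  class DetailsController; end
end

# Hypothetical helper: builds (or reuses) a worker class named after the
# controller, e.g. Videos::DetailsController -> Videos::DetailsWorker.
def default_worker_for(controller_class)
  worker_name = controller_class.name.sub(/Controller\z/, 'Worker')
  *namespaces, base = worker_name.split('::')
  scope = namespaces.inject(Object) { |mod, part| mod.const_get(part) }
  return scope.const_get(base) if scope.const_defined?(base, false)

  # The anonymous class gets its name when it is assigned to the constant.
  scope.const_set(base, Class.new(ApplicationWorker))
end

worker = default_worker_for(Videos::DetailsController)
puts worker.name
puts worker.queue
```

Because the generated class inherits from ApplicationWorker, any setting defined there is shared by every worker built this way.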

Karafka parser

Kafka is a cross-platform tool. It means that when integrating with other applications, you might not receive JSON data (it depends on the producer). Custom parsers allow you to handle any data format you need. By default Karafka parses messages with a JSON parser. A custom parser needs to have a #parse method and must raise an error that is a ::Karafka::Errors::ParserError descendant when a problem appears during parsing. A parsing failure won’t stop the application flow. Instead, Karafka will assign the raw message to the :message key of params. That way you can handle the raw message inside the Sidekiq worker (error detection and any other “heavy” parsing logic can and should be implemented there).
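
The sketch below shows what a custom parser following these rules might look like. Several things here are assumptions: Parsers::KeyValue and params_for are invented names, parse is assumed to be called on the class given in the route, and the small Karafka::Errors::ParserError stand-in exists only so that the snippet runs without the gem loaded.

```ruby
# Stand-in so the sketch runs without the karafka gem (assumption).
unless defined?(::Karafka::Errors::ParserError)
  module Karafka
    module Errors
      class ParserError < StandardError; end
    end
  end
end

module Parsers
  # Hypothetical parser for "key=value;key=value" payloads.
  class KeyValue
    class InvalidFormat < ::Karafka::Errors::ParserError; end

    def self.parse(raw)
      raw.to_s.split(';').each_with_object({}) do |pair, result|
        key, value = pair.split('=', 2)
        raise InvalidFormat, "cannot parse #{pair.inspect}" if key.to_s.empty? || value.nil?

        result[key] = value
      end
    end
  end
end

# Mimics the fallback described above: on a parser error the raw
# message ends up under the :message key (hypothetical helper).
def params_for(raw, parser)
  parser.parse(raw)
rescue ::Karafka::Errors::ParserError
  { message: raw }
end

p params_for('a=1;b=2', Parsers::KeyValue)
p params_for('broken', Parsers::KeyValue)
```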

Karafka interchanger

Custom interchangers target issues with non-default (binary, etc.) data that we want to store when we do #perform_async. This data might be corrupted when fetched in a worker (see this issue). With custom interchangers, you can encode/compress data before it is passed to scheduling and decode/decompress it when it gets into the worker.

Note: Interchangers are not the same as parsers. Parsers parse incoming data and convert it to a “Ruby friendly” format. Interchangers allow you to add an additional layer of encoding, so for example your binary data won’t be escaped by Sidekiq.

Warning: if you decide to use slow interchangers, they might significantly slow down Karafka.
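
A minimal sketch of the encode/decode round trip, assuming an interchanger is a class with one method that prepares data for scheduling and one that restores it in the worker. The method names load and parse are my assumption here, so check the Karafka documentation for the exact interface.

```ruby
module Interchangers
  # Hypothetical interchanger: Marshal + Base64 turns any payload into
  # plain ASCII text, so binary data survives JSON serialization.
  class Base64Binary
    # Called before the data is scheduled (assumed name).
    def self.load(params)
      [Marshal.dump(params)].pack('m0') # 'm0' = strict Base64, no newlines
    end

    # Called inside the worker to restore the data (assumed name).
    def self.parse(encoded)
      Marshal.load(encoded.unpack1('m0'))
    end
  end
end

payload = { 'frame' => "\xFF\xD8\x00".b }
encoded = Interchangers::Base64Binary.load(payload)
restored = Interchangers::Base64Binary.parse(encoded)

puts encoded.ascii_only?
puts restored == payload
```

Marshal.load should only be used on data that your own application produced, since unmarshalling untrusted input is unsafe.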

Summary

Karafka’s routing engine might seem simple, however it is simple only until you decide to use all the non-required features.

Thanks to this approach you can prototype and design your application flow really fast, and still handle any non-standard case when you need to.

Is the current routing engine a final one? To be honest it is strongly dependent on Apache Kafka’s development. We will see in the future :-)

Posted on 31 Jan 2016 in Ruby, Software

Upgrading to Ruby on Rails 5.0 from Rails 4.2 – application use case

In this article, I’ll try to cover all the issues that I had when I was upgrading one of my Rails apps from Ruby on Rails 4.2 to Ruby on Rails 5.0.

Note: This is not a complete tutorial that covers every possible case. You might encounter issues that I didn’t. If so, feel free to comment and I will try to update the post with other issues and solutions.

Preparations – What should we do before upgrading to Rails 5?

  • Upgrade your Ruby version at least to 2.2.2 (I would recommend 2.3, since it is going to be released soon)
  • Upgrade bundler
  • Upgrade your application to the most recent Rails 4.2 version
  • Check gem compatibility (you may need to be on edge with a few gems, or use their Rails 5 branches if available)
  • Write more tests if you don’t have a decent code coverage

The last point is the most important. If you don’t have a good code coverage level and you lack tests, upgrading from Rails 4.2 to Rails 5 might be a big problem.

Upgrading to Rails 5 – Gemfile

Currently there are some gem incompatibilities (I will cover them later) and with some gems you need to go edge. Here’s a list of gems that I had to switch to edge (or pin to a specific version) in order to get things running:

gem 'rails', '5.0.0.beta1'
gem 'devise', github: 'plataformatec/devise'
gem 'responders', github: 'plataformatec/responders'
gem 'ransack', github: 'activerecord-hackery/ransack'
gem 'kaminari', github: 'amatsuda/kaminari'
gem 'mailboxer', github: 'mailboxer/mailboxer'

group :test do
  gem 'rspec', github: 'rspec/rspec'
  gem 'rspec-mocks', github: 'rspec/rspec-mocks'
  gem 'rspec-expectations', github: 'rspec/rspec-expectations'
  gem 'rspec-support', github: 'rspec/rspec-support'
  gem 'rspec-core', github: 'rspec/rspec-core'
  gem 'rspec-rails', github: 'rspec/rspec-rails', branch: 'rails-5-support-patches'
  gem 'rails-controller-testing'
end

Note that I’ve added the rails-controller-testing gem. This gem brings back assigns to your controller tests as well as assert_template to both controller and integration tests. If you use those methods, you will have to add it after upgrading to Ruby on Rails 5.

After updating your Gemfile you can do a

bundle install

and hopefully you’re ready for the upgrade!

Configuration files – rake rails:update

There are some changes in configuration files (that I will cover), but I would strongly recommend running:

rake rails:update

Allow it to overwrite your files and then just add back the stuff that was gone (git diff). IMHO it is a way less complex approach than trying to add all the new config options manually.

In my case, the following files were affected:

On branch rails5
Changes not staged for commit:
    modified:   bin/rails
    modified:   bin/rake
    modified:   config/application.rb
    modified:   config/boot.rb
    modified:   config/environments/development.rb
    modified:   config/environments/production.rb
    modified:   config/environments/test.rb
    modified:   config/initializers/assets.rb
    modified:   config/initializers/cookies_serializer.rb
    modified:   config/initializers/session_store.rb
    modified:   config/initializers/wrap_parameters.rb

Untracked files:
    bin/setup
    bin/update
    config/initializers/application_controller_renderer.rb
    config/initializers/backtrace_silencers.rb
    config/initializers/cors.rb
    config/initializers/inflections.rb
    config/initializers/mime_types.rb
    config/initializers/request_forgery_protection.rb
    config/redis/

However, not all the changes are worth mentioning (sometimes only the comments changed).

bin/rails and bin/rake

If you didn’t have any custom stuff there, just go with the flow ;)

config/application.rb and config/boot.rb

I’ll just quote the comment from application.rb (although it is not a new feature, it is still worth a reminder):

Settings in config/environments/* take precedence over those specified here. Application configuration should go into files in config/initializers -- all .rb files in that directory are automatically loaded.

And this is exactly what you should do. If you have any custom stuff in application.rb – just move it to initializers. And what about boot.rb? Leave it as it is.

config/environments/*.rb

development.rb – feature toggling and file watcher

There’s something quite interesting in new development.rb: feature toggling. Thanks to some files that are in tmp/ you can disable/enable some features to test them in the dev mode, without having to switch to production mode (for example caching).

Rails.root.join('tmp/caching-dev.txt').exist?
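
The mechanism is just a marker file check. Here is a plain Ruby sketch of the same idea that runs without Rails; caching_enabled? is a made-up helper, and a temporary directory stands in for Rails.root:

```ruby
require 'fileutils'
require 'pathname'
require 'tmpdir'

# Returns true when the marker file exists under <root>/tmp.
def caching_enabled?(root)
  Pathname.new(root).join('tmp/caching-dev.txt').exist?
end

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'tmp'))
  puts caching_enabled?(root) # marker file is missing

  FileUtils.touch(File.join(root, 'tmp', 'caching-dev.txt'))
  puts caching_enabled?(root) # marker file is present
end
```

Creating or removing the file flips the feature without touching any configuration file.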

There is also a (commented out) file watcher. You can use it to get auto-reloading based on your file changes:

# Use an evented file watcher to asynchronously detect changes in source code,
# routes, locales, etc. This feature depends on the listen gem.
# config.file_watcher = ActiveSupport::EventedFileUpdateChecker

test.rb – random order

Tests will now run in a random order. If your tests rely on their execution order, then it is time to rethink them.

config.active_support.test_order = :random

production.rb – no more serve_static_files and Action Cable

Rack cache is out (config.action_dispatch.rack_cache). The config.serve_static_files = false flag is being replaced by:

config.public_file_server.enabled = ENV['RAILS_SERVE_STATIC_FILES'].present?
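
One thing to watch out for: present? only checks that the value is not blank, so setting the variable to "false" still enables the file server. The helper below is a plain Ruby approximation of that check (env_flag_present? is a made-up name), written so the behavior can be tried without Rails:

```ruby
# Approximates ActiveSupport's present? for ENV values:
# nil, empty and whitespace-only strings count as "not present".
def env_flag_present?(value)
  !value.to_s.strip.empty?
end

puts env_flag_present?(nil)
puts env_flag_present?('')
puts env_flag_present?('true')
puts env_flag_present?('false') # still counts as present
```

To turn the file server off, leave the variable unset instead of setting it to a "falsy" string.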

There are also a few configuration options for Action Cable available:

# Action Cable endpoint configuration
config.action_cable.url = 'wss://example.com/cable'
config.action_cable.allowed_request_origins = [ 'http://example.com', /http:\/\/example.*/ ]
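
The origins list mixes an exact string with a regular expression. The sketch below assumes that each entry is compared against the request origin with === (my understanding, not verified against the Action Cable source); origin_allowed? is a made-up helper. It also shows that an unanchored pattern like the one above matches more than you might expect:

```ruby
ALLOWED_ORIGINS = ['http://example.com', %r{http://example.*}].freeze

# String === String is an equality check, Regexp === String is a match.
def origin_allowed?(origin, allowed = ALLOWED_ORIGINS)
  allowed.any? { |pattern| pattern === origin }
end

puts origin_allowed?('http://example.com')
puts origin_allowed?('https://example.com')
puts origin_allowed?('http://example.evil.test')

# A stricter, anchored alternative (hypothetical):
STRICT_ORIGINS = [%r{\Ahttp://example\.(com|org)\z}].freeze
puts origin_allowed?('http://example.evil.test', STRICT_ORIGINS)
```

Anchoring the pattern with \A and \z and escaping the dots keeps look-alike hosts out.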

config/initializers/*.rb

application_controller_renderer.rb

A new initializer with the configuration for the application controller renderer. This is a really good feature that finally allows you to drop some “bypass-like” solutions. If you don’t render outside of the controller scope, you can leave this commented out.

ApplicationController.renderer.defaults.merge!(
  http_host: 'example.org',
  https: false
)

cors.rb

Here are all the settings related to CORS. Handle Cross-Origin Resource Sharing (CORS) in order to accept cross-origin AJAX requests. If you don’t know what that means, it means that you can remove this file.

config/redis/cable.yml

Redis settings for ActionCable. Nothing special except one thing: why do we have config/database.yml and config/redis/cable.yml instead of config/databases/redis.yml and config/databases/mysql.yml (or something similar)?

Starting your freshly updated Rails 5 application

As I said before, if you have a decent code coverage level and you know what you’re doing, the upgrade should not be a big problem.

At this point, if you’re not using any fancy route settings or any other magic, you should be able to at least start your application:

rails s

Puma 2.15.3 starting...
* Min threads: 0, max threads: 16
* Environment: development
* Listening on tcp://localhost:3000

However don’t expect it to work correctly… yet ;)

Monkey patching gems

We’re just before Christmas, so I don’t expect gem maintainers to fix these issues right away. That’s why we will fix some of them on our own. Here’s a list of gems with some incompatibilities (even when using edge) that I’ve encountered (not including Rails gems):

Kaminari

If you see this error:

Generating an URL from non sanitized request parameters is insecure!

Just use the master branch source code from Github:

gem 'kaminari', github: 'amatsuda/kaminari'

ActiveRecord Import – undefined method type_cast

/activerecord-import/import.rb:479→ block (2 levels) in values_sql_for_columns_and_attributes
/activerecord-import/import.rb:469→ each
/activerecord-import/import.rb:469→ each_with_index
/activerecord-import/import.rb:469→ each
/activerecord-import/import.rb:469→ map
/activerecord-import/import.rb:469→ block in values_sql_for_columns_and_attributes
/activerecord-import/import.rb:468→ map
/activerecord-import/import.rb:468→ values_sql_for_columns_and_attributes
/activerecord-import/import.rb:395→ import_without_validations_or_callbacks
/activerecord-import/import.rb:362→ import_with_validations
/activerecord-import/import.rb:304→ import_helper
/activerecord-import/import.rb:246→ import

To fix this, create an activerecord_import_patch.rb file in initializers and place this code:

class ActiveRecord::Base
  class << self
    def values_sql_for_columns_and_attributes(columns, array_of_attributes)   # :nodoc:
      connection_memo = connection
      array_of_attributes.map do |arr|
        my_values = arr.each_with_index.map do |val,j|
          column = columns[j]

          if val.nil? && column.name == primary_key && !sequence_name.blank?
            connection_memo.next_value_for_sequence(sequence_name)
          elsif column
            if column.respond_to?(:type_cast_from_user)
              connection_memo.quote(column.type_cast_from_user(val), column)
            elsif column.respond_to?(:type_cast)
              connection_memo.quote(column.type_cast(val), column)
            else
              connection_memo.quote(val)
            end
          end
        end
        "(#{my_values.join(',')})"
      end
    end
  end
end

ActsAsTaggableOn – Invalid Rails version detection

If you try using ActsAsTaggableOn with Rails 5, you’ll end up with this exception:

ArgumentError (Unknown key: :order. Valid keys are: :class_name, :anonymous_class,
:foreign_key, :validate, :autosave, :table_name, :before_add, :after_add, :before_remove,
:after_remove, :extend, :primary_key, :dependent, :as, :through, :source, :source_type,
:inverse_of, :counter_cache, :join_table, :foreign_type, :index_errors):
  app/models/concerns/taggable.rb:6:in `block in <module:Taggable>'
  app/models/text.rb:6:in `include'
  app/models/text.rb:6:in `<class:Text>'
  app/models/text.rb:2:in `<top (required)>'
  app/services/portal/main_service.rb:5:in `<class:MainService>'
  app/services/portal/main_service.rb:3:in `<module:Portal>'
  app/services/portal/main_service.rb:1:in `<top (required)>'
  app/controllers/portal/main_controller.rb:5:in `block in <class:MainController>'
  app/controllers/portal/main_controller.rb:6:in `block in <class:MainController>'
  app/controllers/portal/main_controller.rb:13:in `index'
  lib/middlewares/nofollow_anchors.rb:23:in `call'

it’s directly related to this code:

def has_many_with_taggable_compatibility(name, options = {}, &extention)
  if ActsAsTaggableOn::Utils.active_record4?
    scope, opts = build_taggable_scope_and_options(options)
    has_many(name, scope, opts, &extention)
  else
    has_many(name, options, &extention)
  end
end

and as you’ve probably already guessed, #active_record4? returns false, so it will try to use the Rails 3 method call. To fix this, create an acts_as_taggable_on.rb file in config/initializers and place the following code there:

module ActsAsTaggableOn::Utils
  def self.active_record4?
    true
  end
end

after a restart, this problem should be solved.
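
I have not checked how the gem implements #active_record4?, but a typical cause of this kind of bug is an exact comparison against one major version. The sketch below contrasts that with a "this version or newer" check. Both helpers are hypothetical and use only Gem::Version from RubyGems:

```ruby
# Exact match on the major version: breaks as soon as 5.x arrives.
def major_version_is_4?(version_string)
  Gem::Version.new(version_string).segments.first == 4
end

# "4.0 or newer": keeps working for later releases.
def at_least_version_4?(version_string)
  Gem::Version.new(version_string) >= Gem::Version.new('4.0.0')
end

%w[3.2.22 4.2.5 5.0.0.beta1].each do |version|
  puts "#{version}: exact=#{major_version_is_4?(version)} at_least=#{at_least_version_4?(version)}"
end
```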

Decent Exposure – methods that no longer exist

This should be considered as a temporary patch. It is not a solid solution (I just wanted to make it work – not to fix decent exposure).

decent_exposure/expose.rb:14:in `block in extended': undefined method `hide_action'
  for ActionController::Base:Class (NoMethodError)
  from /gems/decent_exposure-2.3.2/lib/decent_exposure/expose.rb:7:in `class_eval'
  from /gems/decent_exposure-2.3.2/lib/decent_exposure/expose.rb:7:in `extended'
  from /gems/decent_exposure-2.3.2/lib/decent_exposure.rb:5:in `extend'
  from /gems/decent_exposure-2.3.2/lib/decent_exposure.rb:5:in `block in <top (required)>'
  from /gems/activesupport-5.0.0.beta1/lib/active_support/lazy_load_hooks.rb:38:in `instance_eval'
  from /gems/activesupport-5.0.0.beta1/lib/active_support/lazy_load_hooks.rb:38:in `execute_hook'
  from /gems/activesupport-5.0.0.beta1/lib/active_support/lazy_load_hooks.rb:45:in `block in run_load_hooks'
  from /gems/activesupport-5.0.0.beta1/lib/active_support/lazy_load_hooks.rb:44:in `each'
  from /gems/activesupport-5.0.0.beta1/lib/active_support/lazy_load_hooks.rb:44:in `run_load_hooks'
  from /gems/actionpack-5.0.0.beta1/lib/action_controller/base.rb:262:in `<class:Base>'
  from /gems/actionpack-5.0.0.beta1/lib/action_controller/base.rb:164:in `<module:ActionController>'
  from /gems/actionpack-5.0.0.beta1/lib/action_controller/base.rb:5:in `<top (required)>'
  from /bundler/gems/ransack-2a3759317a44/lib/ransack.rb:38:in `<top (required)>'
  from /gems/bundler-1.10.6/lib/bundler/runtime.rb:76:in `require'
  from /gems/bundler-1.10.6/lib/bundler/runtime.rb:76:in `block (2 levels) in require'
  from /gems/bundler-1.10.6/lib/bundler/runtime.rb:72:in `each'
  from /gems/bundler-1.10.6/lib/bundler/runtime.rb:72:in `block in require'
  from /gems/bundler-1.10.6/lib/bundler/runtime.rb:61:in `each'
  from /gems/bundler-1.10.6/lib/bundler/runtime.rb:61:in `require'
  from /gems/bundler-1.10.6/lib/bundler.rb:134:in `require'
  from /home/mencio/Software/Senpuu/config/application.rb:7:in `<top (required)>'

The Decent Exposure gem uses some methods from ActionController::Base that no longer exist. To fix that, we will create dummy methods that just play along with the Decent Exposure internal logic (config/initializers/decent_exposure.rb):

# Patches for Rails 5 and decent exposure
# Put the line below above the require rails line!
# require File.expand_path('../initializers/decent_exposure', __FILE__)
# require 'rails/all'
module DecentExposure
  module Expose
    def hide_action(action)
    end

    def protected_instance_variables
      []
    end
  end
end

As for DecentExposure, we need to make one more change. We need to add a require at the top of the application.rb file:

# The first and last line are already there - put the decent_exposure one in the middle
require File.expand_path('../boot', __FILE__)
require File.expand_path('../initializers/decent_exposure', __FILE__)
require 'rails/all'

We had to patch this gem before it is loaded, because on load it already performs some logic that requires non-existing methods.
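
Why the load order matters can be shown with a small plain Ruby model. The module and class names below are stand-ins (not the real gem or Rails classes): the "gem" calls hide_action from its extended hook, so the method has to exist at the moment the module is extended.

```ruby
# Step 1: our patch, loaded first. It defines the missing method.
module FakeExposure
  def hide_action(*_names); end
end

# Step 2: the "gem", loaded later. It reopens the same module and
# calls hide_action as soon as a class extends it.
module FakeExposure
  def self.extended(base)
    base.class_eval { hide_action :exposed_value }
    base.instance_variable_set(:@exposure_ready, true)
  end
end

class FakeController
  extend FakeExposure
end

puts FakeController.instance_variable_get(:@exposure_ready)

# Without the patch the same hook fails at extend time.
module UnpatchedExposure
  def self.extended(base)
    base.class_eval { hide_action :exposed_value }
  end
end

begin
  Class.new { extend UnpatchedExposure }
rescue NoMethodError => e
  puts "unpatched: #{e.class}"
end
```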

Code base changes

If you’re not planning to use any new Rails features and just (for now) stay with what you had, you won’t have many things to change.

ApplicationRecord instead of ActiveRecord::Base

It is like ApplicationController: an extra inheritance layer in case you want to have some global (shared by all models) elements. Just add app/models/application_record.rb and change the inheritance in all the models:

# Base AR class for all other
class ApplicationRecord < ActiveRecord::Base
  self.abstract_class = true
end

class Comment < ApplicationRecord
end

Note: this is not a must. Your app should work even if you don’t change it.

ActionController::Parameters: params are no longer a Hash

Values that are sent to params are no longer a hash internally. Now they are an ActionController::Parameters object, so type checking like this:

params[:q].is_a?(Hash)

has to be replaced with:

params[:q].is_a?(ActionController::Parameters)
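
If the same code has to accept both plain hashes and params objects, a duck-typing check is one option. The sketch below uses a stand-in class, because the real ActionController::Parameters needs Rails; FakeParameters and hash_like? are made up for this example, and relying on each_pair is my assumption about what both types have in common.

```ruby
# Stand-in for a params object that wraps a Hash without inheriting from it.
class FakeParameters
  def initialize(hash)
    @hash = hash
  end

  def each_pair(&block)
    @hash.each_pair(&block)
  end
end

# Accepts anything that can iterate over key/value pairs.
def hash_like?(value)
  value.respond_to?(:each_pair)
end

wrapped = FakeParameters.new({ q: 'ruby' })

puts wrapped.is_a?(Hash)
puts hash_like?(wrapped)
puts hash_like?({ q: 'ruby' })
puts hash_like?('ruby')
```

Keep in mind that other objects (Struct instances, for example) also respond to each_pair, so this check is broader than is_a?.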

ActiveRecord::ReadOnlyRecord: Picture is marked as readonly

Note: this is a dirty hack.

Sometimes I load objects with include and join in a default scope:

scope :with_picture, -> { joins(:picture).includes(:picture) }

default_scope lambda {
  self.require_picture? ? with_picture : where(false)
}

Up until now I could update/remove the main object and its pictures. However, now when I fetch an object, its pictures are marked as readonly, so I get this exception:

ActiveRecord::ReadOnlyRecord: Picture is marked as readonly

activerecord-5.0.0.beta1/lib/active_record/persistence.rb:532→ create_or_update
activerecord-5.0.0.beta1/lib/active_record/callbacks.rb:298→ block in create_or_update
activesupport-5.0.0.beta1/lib/active_support/callbacks.rb:126→ call
activesupport-5.0.0.beta1/lib/active_support/callbacks.rb:506→ block (2 levels) in compile
activesupport-5.0.0.beta1/lib/active_support/callbacks.rb:455→ call
activesupport-5.0.0.beta1/lib/active_support/callbacks.rb:101→ __run_callbacks__
activesupport-5.0.0.beta1/lib/active_support/callbacks.rb:750→ _run_save_callbacks
activerecord-5.0.0.beta1/lib/active_record/callbacks.rb:298→ create_or_update
activerecord-5.0.0.beta1/lib/active_record/suppressor.rb:41→ create_or_update
activerecord-5.0.0.beta1/lib/active_record/persistence.rb:125→ save
activerecord-5.0.0.beta1/lib/active_record/validations.rb:44→ save
activerecord-5.0.0.beta1/lib/active_record/attribute_methods/dirty.rb:22→ save

I wouldn’t be surprised at all (I know this error) but for some reason it was working with 4.2. So if you want to migrate and deal with it later, add this to your dependent models (those for which this error happens):

before_save do
  instance_variable_set(:@readonly, false)
  true
end

before_destroy do
  instance_variable_set(:@readonly, false)
  true
end

ActiveRecord::Base default_scope lambda change

Note: I think this also applies to standard scopes.

Not sure if it was a bug or a feature – but it is gone. You can no longer return nil from a lambda default_scope:

# Won't work anymore
default_scope lambda {
  self.require_file? ? with_file : nil
}

If you try using it and the lambda returns nil, you will see:

ArgumentError (invalid argument: nil.):
  app/services/portal/main_service.rb:34:in `recent_texts_pictures'
  app/services/portal/main_service.rb:73:in `max_pictures'
  app/services/portal/main_service.rb:46:in `pictures'
  app/services/portal/main_service.rb:19:in `pictures_page'
  app/controllers/portal/main_controller.rb:6:in `block in <class:MainController>'
  app/controllers/portal/main_controller.rb:13:in `index'
  lib/middlewares/nofollow_anchors.rb:23:in `call'

instead of returning nil, just return where(false):

default_scope lambda {
  self.require_file? ? with_file : where(false)
}

RSpec changes

get/post/put/delete with format also for html

When I don’t provide format: :html, the request is no longer assumed to be an HTML request:

before { get :edit }
before { post :create }
before { put :update }
before { delete :destroy }

The examples above fail with the following message:

Failure/Error: super(resources, options)
      
  ActionController::UnknownFormat:
    ActionController::UnknownFormat

To fix that, just add format: :html to your requests:

before { get :edit, format: :html }
before { post :create, format: :html }
before { put :update, format: :html }
before { delete :destroy, format: :html }

Note: This seems to be related to responders gem, however adding format fixes the issue.

Symbols to strings for some specs

When matching and expecting resources to receive request parameters, you must send them in string format:

# The first one was no longer working because params now convert it to strings
- let(:mailboxer_message_params) { { mailboxer_message: { body: '' } } }
+ let(:mailboxer_message_params) { { 'mailboxer_message' => { 'body' => '' } } }

before/after actions except/only are now blocks internally

I had some custom extensions for RSpec that check permissions. Up until now, there were (most of the time) symbols inside:

except: %i( index show )

If we wanted to get it back from the controller (outside of requests scopes) we were doing something like that:

before_actions = subject
  ._process_action_callbacks
  .select { |f| f.kind == :before }

unlesses = before_actions.map { |bf| bf.instance_variable_get(:'@unless') }

This part hasn’t changed. We can still obtain our unless/only that way. What did change is what is underneath (what we get back).

In Rails 4.2 we would get strings that we could compare with what we’ve expected:

expect(unlesses.flatten.compact.join).to include "action_name == '#{action}'"

Now each element is a block that accepts a stringified action name and responds with a boolean:

unlesses.flatten.compact.each do |callable|
  # callable.call('index') #=> true, etc
  expect(callable.call(double(action_name: action.to_s))).to eq true
end

RSpec config settings that are no longer valid

Remove all that I’ve listed below since they are no longer valid:

  • config.disable_monkey_patching!
  • expectations.include_chain_clauses_in_custom_matcher_descriptions = true
  • mocks.verify_partial_doubles = true

Conclusion

As I’ve stated before: if you have a decent code coverage level and you know what you’re doing, the upgrade should not be a big problem. However, I still recommend waiting at least until all the most popular gems are up to date.

Good luck!