Don Pflaster

Ecto DateTime arithmetic

Working with timestamps in Elixir is still very much a point of contention within the community. There is a pretty robust library emerging called Timex, but it is not now part of the core of Elixir or Phoenix. I think using native Erlang is probably the simplest guaranteed-to-work solution for performing arithmetic on timestamps.

In the following example, I want an Ecto.DateTime which is 500 seconds later than the current time. Probably the most convenient way of doing this is to grab the current utc time, convert it to a native erlang epoch, and then cast it back.


(:calendar.datetime_to_gregorian_seconds(Ecto.DateTime.to_erl(Ecto.DateTime.utc)) + 500) |>
:calendar.gregorian_seconds_to_datetime |>
Ecto.DateTime.cast
Posted in Ecto, Elixir

Sample Phoenix Framework Controller

Hi all! Happy New Year. I’ve been playing around recently with the Phoenix Framework quite a lot.
While I was learning it, I found myself wishing that a full-fledged example of a traditional MVC controller existed out there. I found a lot of this information in many different places.

Before we look at the controller, let’s look at a “changeset”. Changeset structs are passed into Ecto methods such as Repo.update and Repo.insert. It represents the delta of fields from the original object. This is something you define within your model, and it saves you a bunch of the trouble you would have converting all of the different pieces of information in your params hash into the appropriate datatypes. If you have any additional validations for that model, you can add them to the pipeline.


@required_fields ~w(number)
@optional_fields ~w(name)

def changeset(model, params \\ :empty) do
model
|> cast(params, @required_fields, @optional_fields)
end

This is pretty standard – if you have any other funky field types that don’t come standard with Ecto, such as file attachments handled with arc, this might have some additional methods.
The controller with standard REST routes will look like this:


defmodule MyApp.ThingController do
use MyApp.Web, :controller

alias MyApp.Thing
alias MyApp.Endpoint
alias MyApp.Repo
alias MyApp.Router.Helpers
require Logger
import MyApp.SessionController, only: [authenticate: 2]
import Ecto.Query, only: [from: 2]

plug :authenticate

def index(conn, _params) do
query = from a in Thing,
order_by: [a.name]
things = Repo.all(query)
render conn, "index.html", things: things
end

def show(conn, %{"id" => id}) do
render conn, "edit.html", thing: Repo.get!(Thing, id)
end

# Note: Changesets are used when we expect to change data
def edit(conn, %{"id" => id}) do
thing = Repo.get!(Thing, id)
render conn, "edit.html", changeset: Thing.changeset(thing)
end

def new(conn, _params) do
render conn, "new.html", changeset: Thing.changeset(%Thing{})
end

def create(conn, %{"thing" => thing_params}) do
Thing.changeset(%Thing{}, thing_params) |>
Repo.insert
redirect conn, to: Helpers.thing_path(Endpoint, :index)
end

def update(conn, %{"id" => id, "thing" => thing_params}) do
Repo.get!(Thing, id) |>
Thing.changeset(thing_params) |>
Repo.update
redirect conn, to: Helpers.thing_path(Endpoint, :index)
end

def delete(conn, %{"id" => id}) do
Repo.get!(Thing, id) |>
Repo.delete!
redirect conn, to: Helpers.thing_path(Endpoint, :index)
end
end

Posted in Ecto, Elixir Tagged with: , ,

find_or_create with Ecto

The Ecto library in Elixir apparently does not have a native find_or_create method comparable to Activerecord as of the time of this writing. However, here is a nifty method you can drop into one of your Ecto models to achieve this behavior.

In the following example, we want to find or create by the user’s email.


def find_or_create(user) do
query = from u in User,
where: u.email == ^user.email
Repo.one(query) || Repo.insert!(user)
end

Thanks to info found here!
http://sushruth.me/index.php/13-ecto-associations-part-1-1.html

Posted in Uncategorized

Locking Records with Elixir’s Ecto

As a followup to a previous post about locking records with transactions, I’m now going to show how it’s done with Elixir, and specifically the Ecto library (comparable to ActiveRecord).

Instead of using:

lock: true
Ecto asks you to pass the specific type of locking scheme. In the case of pessimistic locking, that is:
lock: "FOR UPDATE"
That can just be passed into your query with the other parameters.

And of course, be sure to wrap all of this in a transaction, otherwise your lock will not stick around until you’re done working with the record:


Repo.transaction(fn ->
...
end)

Full example:


import Ecto.Query
alias MyRepo.Repo

Repo.transaction(fn ->
query = from a in Account,
where: a.company_id == ^1,
lock: "FOR UPDATE"

account = Repo.all(query) |> List.first

new_balance = account.balance + 5
Repo.update!(%{account | balance: new_balance})
end)

Found here:
https://hexdocs.pm/ecto/0.2.5/Ecto.Query.html#lock/2

Posted in Ecto, Elixir

Headless browsers and proxy authentication

For my most recent gig, I’ve been trying to drive Firefox headlessly using Selenium and a pretty excellent Elixir library called Hound.

One of the tasks I needed to accomplish was to connect to sites through a proxy server that required a username and password.

This is relatively trivial to do in Firefox when you have an interface, and you are able to type in the credentials when the box pops up. But driving it headlessly is a different animal – I spent the better part of the day trying to figure out where this username and password should live and how to even pass it through my app to begin with.

Save yourself the trouble and set up a separately-running proxy running on your local system (using a package such as Squid). This should be able to effortlessly connect to the proxy and authenticate, and all you would need to do is connect to your own local proxy without a user/pass combination. This is a great way to separate the connection logic from the application logic, and if you ever want to not use the proxy, just point to a different Firefox profile (or however your browser of choice accepts proxy configurations).

Posted in Uncategorized

Closure Tree hierarchy counting

We use a nifty gem called Closure Tree in our Rails app to manage our application’s file manager hierarchy. It stores the entire hierarchy in a separate table so you don’t have to traverse the parent IDs recursively every time you want to find the ancestors or descendants.

But what if you want the counts of all the child objects that meet a certain criteria? You could sum all of them up from the bottom of the tree using Ruby, but much better to let your database do this for you using count(*).

For instance – our files and folders are stored in the same table, and we have an identifier. Furthermore, not every user has permission to see every file.

We came up with a simple enough class method to return a hash of each folder ID you pass in and the number of files in each:


base_scope = FileBase.not_folders.
joins("INNER JOIN file_basis_hierarchies ON file_basis_hierarchies.descendant_id = file_bases.id").
where("file_basis_hierarchies.ancestor_id IN (?)", folder_ids)

base_scope = base_scope.allowed_to_see(user) if user

Hash[
base_scope.allowed_to_see(user).
group('file_basis_hierarchies.ancestor_id').
select("file_basis_hierarchies.ancestor_id as id, count (*) as file_count").map { |dc| [dc.id, dc.file_count] }
]

Posted in Uncategorized

Elixirin’

Recently got roped into a side project for an old colleague and building something in Elixir using the Phoenix Framework.

I’ve been using object-oriented (or somewhat OO languages like Perl) for the last 15+ years, so working with Elixir and Erlang is certainly a challenge. I often find myself cursing the loss of convenience associated in having to pass references to everything, frickin’ everywhere.

After stumbling over some blocks and coming out a bit smarter on the other end, I think I’m finally starting to enjoy it. Oh my, is it ever fast. It feels so lean and powerful, and very much like the future. Frustration may mount for now, but things always get better.

Pragmatic just released a Programming Phoenix book which is still in beta and a bit light on the material I’m really interested in (like Channels), but this too will only get better.
https://pragprog.com/book/phoenix/programming-phoenix

Posted in Uncategorized Tagged with: , ,

What the value was

Been programming with Rails for well over three years now, and only now had the need to know what an attribute was before you changed it.


2.2.0 :007 > c.first_name
=> "Nick"
2.2.0 :008 > c.first_name = 'Johnny'
=> "Johnny"
2.2.0 :009 > c.first_name_was
=> "Nick"

May be old hat for a lot of ya, but awesome when you get something for free.

Posted in Ruby on Rails

Nifty PostgreSQL Common Table Expression Update

D’ja ever want to target a specific group of rows that meet a condition in a database table, join it with information from other tables, and then perform an update on them in a single command? PostgreSQL’s Common Table Expressions (CTE) work pretty handily for doing just that, though I rarely see the solution posted conspicuously.

In the case below, I am trying to set a new name field in table 1 based on the names of some underlying tables. The record that has the name might be in table 2 or table 3.


with rows_to_adjust as (
select table1.id, coalesce(table2.name, table3.name) as name_to_set from table1
left outer join table2 on table1.table2_id = table2.id
left outer join table3 on table1.table3_id = table3.id
where table1.name is null
)
update table1 set name = rows_to_adjust.name_to_set
from rows_to_adjust
where rows_to_adjust.id = table1.id;
Posted in PostgreSQL

Fun with race conditions, concurrency, and deadlocks

Sometimes, wheels have to be a little squeaky before we can learn how to properly oil them. Once, a race condition was discovered within our Rails codebase that required a deep dive into concepts of concurrency which I hadn’t yet encountered in my career.

What’s a race condition?

A race condition occurs when two different processes attempt to update the same value simultaneously, and are unaware that the other process is doing it. Usually you would encounter this when multiple Ruby processes have connections to the same database, but you can reproduce the concept on the command line.

An example would be two transactions that both add $5 to the same account balance:


account[0] = Account.find(1) # First instance pulls balance of $10
account[1] = Account.find(1) # Second instance pulls balance of $10
account[0].balance += 5 # First instance Increments balance by $5
account[1].balance += 5 # Second instance Increments balance by $5
account[0].save # First instance writes back $15 as new balance
account[1].save # Second instance also writes back $15

This clearly saves back a balance of $15, where it should now be $20.

Aren’t there other ways of solving the issue without storing the balance?

The original designers of the system chose to sum the balance every time a new transaction occurs. For instance, within your transaction table, you could have a callback that triggers a balance recalculation on the account every time a transaction is saved:


class Account < ActiveRecord::Base
has_many :transactions

def update_balance!
self.balance = transactions.sum(:credits) - transactions.sum(:debits)
self.save!
end
end

class Transaction < ActiveRecord::Base
belongs_to :account
after_save :update_balance
def update_balance
account.update_balance!
end
end

This will work happily on your development machine while you’re prototyping, but as the number of transactions in your database increases in the production system, this will take increasingly longer to process. ActiveRecord will begin locking records during the update, and any action that creates transactions (including placement of orders) will be affected. Better to go with storing a balance in an accounts table, and updating it as needed.

Concurrency and Deadlocks

The best way to ensure that no two processes are able to update the balance simultaneously is to lock the record that’s being updated until the process has successfully saved it. This can be done in PostgreSQL, the DB of our choice, by using SELECT FOR UPDATE. In order for such a select to be effective, it must occur within a transaction, which in the example below is bookended by BEGIN and COMMIT.


BEGIN;
SELECT * FROM ACCOUNTS WHERE id = 1 FOR UPDATE;
UPDATE ACCOUNTS SET balance = 15 WHERE id = 1;
COMMIT;

While the block is being executed, other sessions can read the balance but no other session connected to the database can perform its own SELECT FOR UPDATE or perform an UPDATE on that record. They have to wait in a queue until this transaction is finished.

The equivalent of doing this in Ruby with ActiveRecord is to wrap the entire operation in a transaction (which will ensure that everything you do in that block falls between BEGIN and COMMIT), and lock the record:


Account.transaction do
account = Account.find(1, lock: true)
account.balance += 5
account.save
end

This appeared to work well when we used a multi-threading test gem, but we discovered another issue completely by accident. In some instances, while we were updating balances on the account, other processes were adding new transactions. This created a deadlock – and the reason was not immediately apparent, generating this cryptic error:


SELECT 1 FROM ONLY
"public"."accounts" x WHERE "app_id" = $1 FOR SHARE OF x
ERROR: deadlock detected
Process 20480 waits for ShareLock on transaction 74278; blocked by process 20481.
Process 20481 waits for ShareLock on transaction 74275; blocked by process 20480.

What’s going on here? After Googling this, it turns out that referential integrity checks at the database level also perform their own SELECT FOR SHARE, which would also happily wait for a SELECT FOR UPDATE to finish under normal circumstances. But since we were testing with hundreds of new transactions doing INSERTs and UPDATEs from different sessions, they end up crossing paths and waiting for each other to finish. This makes sense – the application can’t have other processes changing the record you’re checking for integrity. But it seems if you want to get around this intractable problem, you need to make sure that the field you’re storing is not going to be sitting on a table that other tables reference. It was decided that we should move the balance field into its own table that would exist only to reference the account and not be referenced by anything.

How did you simulate the multi-threading environment?

Oh! That was pretty cool. Our sysop pointed me to a sweet gem called “parallel”. It allows you to do cool stuff like the following, which sends an array of numbers to be printed in four separate processes:


Parallel.each([1,2,3,4], :in_processes => 4) do |e|
puts e.to_s
end

See more here:
https://github.com/grosser/parallel

Posted in PostgreSQL, Ruby on Rails