I’ve been slowly rebuilding my personal site from the ground up with Tableau. My goals are to include as little JS as possible and to keep payloads light, but the main draw is that Tableau lets me work in Elixir as much as possible. It’s a lot of fun, and since the Tableau project is still in its early stages, anything I tweak may become a nice little PR I can open in the future.
Posts in "elixir"
Ecto, iLike you
In Elixir, macros are… discouraged. From the docs:
Macros should only be used as a last resort. Remember that explicit is better than implicit. Clear code is better than concise code.
I get it; having worked in Ruby, where every codebase seems to be playing a game of Schrödinger’s Macro, it’s refreshing to work in an ecosystem where the code in your editor is what it is. As such, I’ve always tried to embrace minimalism in Elixir. Yet Elixir has macros, and there are some really good “last resorts” as mentioned above. I’ve encountered one such case a few times when working with Ecto; out of the box, Ecto has at least 99% of anything I could ever want in a database wrapper, but over the years there have been an odd handful of cases where I’ve wanted to extend its functionality in one way or another. I’m going to provide a few examples of this in action.
Keep in mind that both of these examples, out of context, may feel a bit contrived, and in neither case does the macro reduce the lines of code. However, if placed in the Repo module, these macros would make convenient, reusable Ecto functions that could be called throughout the codebase.
Combining iLikes
A few years back I was working on some search functionality for a client. Their products were all made to order for specific customers. Allowing customers to search their order history came with several different query options, including the name (first, last, email) of the person who placed the order, those same options for the salesperson who placed the order on their behalf, or various attributes of the product. This led to a whole chain of joins and ilikes:
val ->
    query
    |> join(:left, [order: o], u in assoc(o, :user), as: :user)
    |> join(:left, [order: o], s in assoc(o, :salesperson), as: :salesperson)
    |> join(:left, [user: u], uc in assoc(u, :user_credentials), as: :uc)
    |> join(:left, [salesperson: sp], sc in assoc(sp, :user_credentials), as: :sc)
    |> join(:left, [order: o], oli in assoc(o, :order_line_items), as: :oli)
    |> join(:left, [oli: oli], prod in assoc(oli, :product_item), as: :prod)
    |> join(:left, [prod: prod], config in assoc(prod, :configuration), as: :config)
    |> join(:left, [config: config], pt in assoc(config, :product_type), as: :pt)
    |> search_function(val)
    |> group_by([order: o], o.id)
  end
end
defp search_function(query, value) do
  str = "%#{value}%"

  query
  |> where(
    [order: o, uc: uc, sc: sc, pt: pt],
    ilike(o.po_number, ^str) or
      ilike(uc.email, ^str) or
      ilike(uc.firstname, ^str) or
      ilike(uc.lastname, ^str) or
      ilike(sc.email, ^str) or
      ilike(sc.firstname, ^str) or
      ilike(sc.lastname, ^str) or
      ilike(pt.name, ^str) or
      ilike(pt.design_id, ^str)
  )
end
It’s readable enough, especially the joins; I’d argue that Ecto’s elegant syntax actually makes this slightly more readable than the equivalent SQL statement. But search_function is a bit much, to the point where Credo started lighting up cyclomatic complexity warnings.

There was a better way. Maybe not for all cases; frankly, if Credo hadn’t warned me about the complexity I would have called it a day here. But I thought it would be fun to condense this: keep the pipe of joins, but feed them into a smaller search_function with fewer explicit ilikes. This is where one can make good use of macros with Ecto:
defp search_function(query, value) do
  str = "%#{value}%"

  query
  |> where(
    [order: o, uc: uc, sc: sc, pt: pt],
    multiple_ilike([:email, :firstname, :lastname], uc, str) or
      multiple_ilike([:email, :firstname, :lastname], sc, str) or
      multiple_ilike([:name, :design_id], pt, str) or
      ilike(o.po_number, ^str)
  )
end
defmacro multiple_ilike(keys, schema, value) do
  Macro.expand(ilike_irr(keys, schema, value), __CALLER__)
end

# Base case: with exactly two keys left, emit the final pair of ilikes
defp ilike_irr([key, key2], schema, value) do
  quote do
    ilike(field(unquote(schema), unquote(key)), ^unquote(value)) or
      ilike(field(unquote(schema), unquote(key2)), ^unquote(value))
  end
end

# Recursive case: emit one ilike, then recurse on the remaining keys
defp ilike_irr([key | keys], schema, value) do
  quote do
    ilike(field(unquote(schema), unquote(key)), ^unquote(value)) or
      unquote(ilike_irr(keys, schema, value))
  end
end
Working from the top, this takes our where clause from nine lines to four while still making just as many ilike calls. I would have employed multiple_ilike/3 for orders as well if we were searching more than one of its columns.

It’s fairly standard recursion in Elixir, made only a little more frightening by the quoting and unquoting of macro code and passed-in runtime values.
To illustrate, let’s call it: multiple_ilike([:email, :firstname, :lastname], user_credentials, "%trav%"). The recursive ilike_irr/3 needs at least two columns (although one could handle a single column for a safer API). Each iteration uses Ecto’s ilike/2 function; the macro takes your list of columns (keys), the table (schema), and the search string. We unquote these values because they are not part of the macro itself, i.e. we want them to be whatever we pass in. The first iteration above is going to add ilike(field(user_credentials, :email), "%trav%") to the query. Fairly straightforward (if you aren’t familiar with Ecto, field/2 is a way of dynamically accessing a column, which we need because the quoted code has to work for whatever schema and keys are passed in).

This initial ilike/2 is joined with or/2 (a regular SQL OR) and the macro recurses, producing ilike(field(user_credentials, :firstname), "%trav%") as the right-hand side of the or. We continue in this fashion until there are only two keys left, at which point we return both ilike queries, leaving a fully formed statement with multiple ilike ... OR ilike ... clauses chained together.
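Spelled out, the example call expands to roughly the following (parentheses added to show the recursion):

multiple_ilike([:email, :firstname, :lastname], user_credentials, "%trav%")

# becomes

ilike(field(user_credentials, :email), ^"%trav%") or
  (ilike(field(user_credentials, :firstname), ^"%trav%") or
     ilike(field(user_credentials, :lastname), ^"%trav%"))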
I love stuff like this; Ecto already feels like magic (not because it’s obfuscating anything, just because of how smooth things are) and this lets me add a few ingredients of my own to the potion.
Multi-tenancy with Phoenix and Elixir
There are lots of good places to start with multi-tenancy in Elixir (although I’d recommend Ecto’s own docs for either foreign keys or Postgres schemas). Most of the write-ups and tutorials start the same way: “generate a new Phoenix application with mix phx.new”. This is great if you’re starting an enterprise SaaS app from scratch, but it leaves something to be desired if you, like I was, are migrating an existing codebase with thousands of users and products to a multi-tenant application. I recently went through this with an enterprise client, and there were enough pitfalls and interesting problems to solve that it seemed to warrant a detailed post.
I believe the solution I put together is both effective and elegant, but it is not without its pain points. Mainly, if you are going to use PostgreSQL schemas (which I did) you are going to have to migrate your existing data into said prefixes. There is no easy way around this; it’s just a slog you have to do. More on that later.
Schemas?
I went back and forth for a while but finally settled on query prefixes, as they felt a little more elegant: segmenting data without having to add new foreign keys to any tables. It also makes it easy to migrate or wipe customer data if needed. Admittedly, if you’re managing tens of thousands of tenants in a single database this approach will be a bottleneck. In my case that was not a concern; there are two current tenants, and the client only expects to add a few tenants every year, if that.
As mentioned, Ecto has great docs on setting up schemas; however, I opted to use a dependency called Triplex, mostly for the sake of time (about a week in I realized I could have rewritten most of the required features in a day or two, but we had about a month to make this transition, so a refactor at that point seemed like overkill). Schemas work because we are using PostgreSQL. You can kind of hack together “schemas” with MySQL, but under the veil they’re just separate databases; I can’t vouch for that approach because my Elixir projects are mostly on Postgres.
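For what it’s worth, Triplex made the tenant lifecycle pleasant. If memory serves (check the Triplex docs rather than trusting this sketch), creating and dropping a tenant’s schema is roughly a one-liner each:

# Creates the "salt_lake" schema and runs the tenant migrations
Triplex.create("salt_lake")

# Drops the schema, wiping that tenant's data
Triplex.drop("salt_lake")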
The first big hurdle is ensuring that your queries run in the right schema. By default, Ecto runs queries in the public schema. On any given query you can change this by passing a prefix: option, i.e. Repo.one(query, prefix: "some_prefix"). Rewriting hundreds or thousands of Repo actions with a variable prefix is not exactly convenient, but it’s imperative to ensure queries are scoped to the correct schema. Just imagine the catastrophic breach if Customer A got back Customer B’s data!
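To make the problem concrete: without any help from Ecto, every call site ends up threading the prefix through by hand (Customer and Order here are placeholder schemas):

Repo.all(Customer, prefix: "some_prefix")
Repo.get!(Order, order_id, prefix: "some_prefix")
Repo.insert!(changeset, prefix: "some_prefix")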
Thankfully, you do not have to rewrite all your queries to explicitly pass a prefix. Enter Repo hooks! Ecto.Repo comes with some great behaviours that allow one to effectively write Repo.one(query, prefix: "some_prefix") without actually writing it for every single query. You can implement prepare_query/3, which lets you inspect and modify the query and its options; you add these hooks to YourApp.Repo. This is prepare_query/3 in its simplest form:
@impl true
def prepare_query(_operation, query, opts) do
opts = Keyword.put(opts, :prefix, "some_prefix")
{query, opts}
end
Now all queries will look at the some_prefix schema rather than public. In our app we had a few tables that we wanted scoped to the public schema; for example, you may have an admins table, or possibly oban_jobs, tenants, etc. You can handle this in a few ways:
@impl true
def prepare_query(_operation, query, opts) do
if opts[:skip_prefix] do
{query, opts}
else
opts = Keyword.put(opts, :prefix, "some_prefix")
{query, opts}
end
end
This works, although it necessitates passing skip_prefix: true to all the relevant Repo calls; likely fewer than before, but still kind of defeating the purpose of prepare_query/3.
@sources ~w[admins oban_jobs oban_peers customer_pricing]

@impl true
def prepare_query(_operation, %Ecto.Query{from: %{source: {source, _}}} = query, opts)
    when source in @sources do
  {query, opts}
end

def prepare_query(_operation, query, opts) do
  ...
end
By pattern matching on your allowed tables you can bypass the prefix override. I used a combination of both of the above approaches: a list of allowed source tables as well as the skip_prefix option, which adds a manual override to the API. In theory you shouldn’t need it, but you never know: tests, edge cases, shrugs…
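Usage then looks something like this (Admin and Report are placeholder schemas; admins is on the allow-list above):

# Allow-listed source: the first prepare_query/3 clause passes it through
admins = Repo.all(Admin)

# Manual escape hatch for anything else
rows = Repo.all(Report, skip_prefix: true)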
Tenant Selection
At this point we’ve converted every query in the application to use a configurable prefix in about 10 lines of code. Not bad, but it’s also not actually dynamic yet; I’ve hard-coded some_prefix into my queries. Before we make the hook dynamic we need to determine how Phoenix will recognize the tenant. There are many ways of doing this; in my case, for now, we are using subdomains.

Since the subdomain is available on conn.host, I set up a plug to fetch it:
defmodule MyApp.TenantPlug do
  ...

  def select_organization_from_domain(conn, _opts) do
    subdomain = get_subdomain(conn)
    put_session(conn, :tenant, subdomain)
  end

  defp get_subdomain(%{host: host}) do
    [subdomain | _] = String.split(host, ".")
    subdomain
  end
end
This gets the subdomain and puts it in the session (which is not strictly necessary but is nice to have). Next, let’s pass it to Repo. As with the queries, one need not rewrite every Repo call to pass in a :subdomain option; here Elixir/Phoenix has your back. In Phoenix, each request is handled in its own process, and that process can pass data to itself via the process dictionary. Back in Repo I added these little helpers:
@tenant_key {__MODULE__, :tenant}

def put_tenant_subdomain(subdomain) do
  Process.put(@tenant_key, subdomain)
end

def get_tenant_subdomain do
  Process.get(@tenant_key)
end
Now, back in the TenantPlug, we can add the subdomain to the process:
def select_organization_from_domain(conn, _opts) do
  subdomain = get_subdomain(conn)
  Repo.put_tenant_subdomain(subdomain)
  put_session(conn, :tenant, subdomain)
end
A second Repo behaviour can be used to pass options to every Repo call: default_options/1. Rather than explicitly writing opts = Keyword.put(opts, :prefix, "some_prefix") in the prepare_query/3 hook, default_options/1 sets up your opts before the Repo function runs. From there we call get_tenant_subdomain/0 to retrieve the subdomain/query prefix we set in the plug:
@tenant_key {__MODULE__, :tenant}

@impl true
def default_options(_operation) do
  [prefix: get_tenant_subdomain()]
end

def get_tenant_subdomain, do: Process.get(@tenant_key)
Like prepare_query/3, default_options/1 runs with every query.
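With both hooks in place, the tenant set in the plug flows through automatically; for example, in IEx (Customer being a placeholder schema):

Repo.put_tenant_subdomain("salt_lake")

# No prefix at the call site; default_options/1 supplies it
Repo.all(Customer)
#=> SELECT ... FROM "salt_lake"."customers" ...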
With this implemented, navigating to a specific subdomain will set the tenant in the current process (as well as in the session), and any database queries in that session will be scoped to the tenant’s schema. Putting it all together, we have something like this in repo.ex:
@allowed_sources ~w[oban_jobs tenants]
@tenant_key {__MODULE__, :tenant}

@impl true
def default_options(_operation) do
  [prefix: get_tenant_subdomain()]
end

@impl true
def prepare_query(_operation, %Ecto.Query{from: %{source: {source, _}}} = query, opts)
    when source in @allowed_sources do
  {query, Keyword.put(opts, :prefix, "public")}
end

def prepare_query(_operation, query, opts) do
  if opts[:skip_prefix] do
    # manual escape hatch: fall back to the public schema
    {query, Keyword.put(opts, :prefix, "public")}
  else
    # default_options/1 has already set the tenant prefix
    {query, opts}
  end
end

def put_tenant_subdomain(subdomain) do
  Process.put(@tenant_key, subdomain)
end

def get_tenant_subdomain do
  Process.get(@tenant_key)
end
The simplified version of my tenant_selection_plug.ex looks like:
def select_organization_from_domain(conn, _opts) do
  subdomain = get_subdomain(conn)
  Repo.put_tenant_subdomain(subdomain)
  put_session(conn, :tenant, subdomain)
end

defp get_subdomain(%{host: host}) do
  [subdomain | _] = String.split(host, ".")
  subdomain
end
In production we are handling a lot more, such as authorization with Guardian, but this shows how simple it is to get a subdomain and add it to the session.

The above is a fairly bare-bones approach; our final project had a lot more customization and ended up being organized a bit differently. For example, we extracted the functions for getting and setting the tenant key in the process into their own module. My hope is that the above lays the groundwork for anyone looking to do something similar.
Data Migration
I wish I had a data-migration solution half as slick as the Ecto behaviours that made querying across schemas so painless. I was unable to find an elegant way to migrate the relevant data into specific schemas, so I was forced to do it with good old SQL.
-- copy locations
INSERT INTO salt_lake.locations SELECT * FROM public.locations WHERE id = 'salt_lake_location_id';
-- copy customers
INSERT INTO salt_lake.customers SELECT * FROM public.customers WHERE location_id = 'salt_lake_location_id';
I had about 50 queries similar to this. Fortunately, tenants were mapped to locations, and at the time of the migration the client only had two tenants (the system was migrating from a product business to a consulting business). I ran these queries twice, replacing salt_lake with bakersfield on the second pass. In my case, due to the way the system was originally designed to work with an external system (look’en at you, QuickBooks) and some changes the customer was making to how that system would be used, this migration ended up being a bit more hairy than expected. I had to write several ad-hoc queries that looked less like the above and more like:
INSERT INTO salt_lake.qb_orders
SELECT qb.* FROM qb_orders qb
JOIN orders o ON o.qb_order_id = qb.id
JOIN customers c ON o.customer_id = c.id
WHERE NOT EXISTS (SELECT 1 FROM salt_lake.qb_orders slcqb WHERE slcqb.id = qb.id)
AND c.name ILIKE '%A Problematic Customer%';
Again, that’s not the fault of the multi-tenancy setup; migrating data in any complex system is always going to have its prickly bits. If anyone has ideas for a more elegant migration pattern (for the first two queries; ignore the last one, that’s an unfortunate specific), I’m all ears: shoot me an email at self[at]travisfantina.com.
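For the simpler copies, the best I managed was scripting the repetition rather than eliminating it. A rough sketch, not what I actually ran: the tenant/location pairs and table list are illustrative, and it assumes each table carries a location_id column:

tenant_locations = [
  {"salt_lake", "salt_lake_location_id"},
  {"bakersfield", "bakersfield_location_id"}
]

# Tables that hang off a location; locations itself is copied by id
tables = ~w[customers orders]

for {schema, location_id} <- tenant_locations, table <- tables do
  # schema and table come from the lists above, not user input,
  # so string interpolation is tolerable here
  Repo.query!(
    "INSERT INTO #{schema}.#{table} SELECT * FROM public.#{table} WHERE location_id = $1",
    [location_id]
  )
end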
Personal Heuristic: Make it Readable
I wrote this post back in January, just dusted it off to post today as I attempt to get back on the blogging horse.
Today I was refactoring a small module that makes calls to an SAP endpoint. The compiler got hung up because it couldn’t find the value item. It was an easy fix; my code looked like this:
for itm <- data do
%{"MATNR" => material, "PSTYV" => category, "VBELN" => so} = item
%{material: material, category: category, so: so}
end
It’s easy to spot (especially when the compiler tells you exactly where it is): in the generator I wrote itm, but down below I’m looking for item. Simple; yet this is not the first time something similar has happened to me. It’s also not the first time I’ve specifically confused itm with item, which led me to this conclusion: just write item every time. There is an odd switch in my brain that thinks I’m penalized by the character, and that leaving the e out of item will somehow make my code more terse.
While technically true, it’s not worth it. It never is; just write item, every time. People know what item is. itm is more ambiguous, and not just because it only saves one letter: it could be an abbreviation or some weird naming convention. Why put that mental load on someone, even yourself, reading through this code?
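For completeness, the fixed comprehension:

for item <- data do
  %{"MATNR" => material, "PSTYV" => category, "VBELN" => so} = item
  %{material: material, category: category, so: so}
end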
This is a tiny example, but it’s magnified in function names. While check_preq may be quick to type and takes up less horizontal space in an editor, it’s not immediately clear what the function does. I would argue that get_purchase_requisition_number is a much better name; even if you know nothing about the function, the codebase, or programming in general, you can read it and know what’s supposed to happen.
Of course there are conventions, e.g. the ! (dangerous, may raise) and ? (predicate) method endings in Ruby and Elixir. These sorts of things require one to be a little familiar with the patterns of a language, but that’s OK; it just means I can write a function get_purchase_requisition_number! and anyone familiar with Ruby or Elixir will expect it to raise or return a bare value (as opposed to something wrapped in an :ok tuple).
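A quick illustration of the convention in Elixir:

Map.fetch(%{a: 1}, :b)   #=> :error, a tagged value you handle explicitly
Map.fetch!(%{a: 1}, :b)  #=> raises KeyError; the ! warns you up front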
Moving forward I’m calling things what they are even if it comes with a dash of verbosity.
Adding a `soft_delete` to Ecto Multi pipelines
I’m a big fan of Ecto, Elixir’s database wrapper. The Multi module lets you build up a series of operations that happen in order; if one fails, the entire operation rolls back. Multi comes with a lot of standard CRUD built in: insert/4, update/4, delete/4, and their bulk counterparts insert_all/5, update_all/5, and delete_all/5 for acting on multiple records.
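As a quick refresher, a Multi pipeline looks like this (the changesets are hypothetical):

Multi.new()
|> Multi.insert(:customer, customer_changeset)
|> Multi.update(:account, account_changeset)
|> Repo.transaction()
#=> {:ok, %{customer: ..., account: ...}}
#   or {:error, failed_step, failed_value, changes_so_far}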
I’ve been working on a project where we make use of the soft delete pattern: rather than calling delete/4 on a record, we generally update/4 it, passing in a deleted_at timestamp:
|> Multi.update(:soft_delete, fn %{customer: customer} ->
Changeset.change(customer, %{deleted_at: now})
end)
This works fine, and even when updating multiple records one could take this approach:
|> Multi.update_all(:soft_delete, fn %{customers: customers} ->
ids = Enum.map(customers, & &1.id)
from(c in Customer, where: c.id in ^ids, update: [set: [deleted_at: ^now]])
end, [])
I was working on a new feature that would require a cascade of soft deletes: deleting multiple records, their associated records, their children, etc. (as the second example above is doing). Admittedly, I could have just utilized Multi.update_all/5 and put multiple steps into the multi. However, I thought continuously mapping specific ids and passing in set: [deleted_at: ^now] was a little cumbersome and not very idiomatic. Mostly, I wanted to have a bit of fun wondering: “what if Ecto.Multi had a soft_delete_all function?” Of course it doesn’t; this is a niche use case. But it makes sense in this application, so I dug in and found the task to be (as is often the case with Elixir) surprisingly easy.
Just like update_all/5, I wanted to make sure soft_delete_all would handle either queries or functions being passed in. Pattern matching with the is_function/1 guard made this a fairly straightforward operation:
@spec soft_delete_all(Multi.t(), atom(), fun() | Query.t(), keyword()) :: Multi.t()
def soft_delete_all(multi, name, func, opts \\ [])
def soft_delete_all(multi, name, func, opts) when is_function(func) do
Multi.run(
multi,
name,
operation_fun({:soft_delete_all, func, [set: [deleted_at: Timex.now()]]}, opts)
)
end
def soft_delete_all(multi, name, queryable, opts) do
add_operation(multi, name, {:update_all, queryable, [set: [deleted_at: Timex.now()]], opts})
end
The first function head matches against functions while the second matches against a queryable; I’ll explain the distinction between the two.

Under the hood, Multi is already equipped to handle functions or queryables. By reading the source of the Multi module I was able, in one case, to forward along the proper structure for the Multi to run, and in the other to recreate the same functionality that Multi.update_all uses. Both operation_fun/2 and add_operation/3 are nearly copy-pasted from the Multi core.
In the first instance the multi is passed a function, something like:
|> soft_delete_all(:remove_customer, &remove_customer/1)
In this case we add a new operation to the pipeline with Multi.run/3, but the Multi still needs to run the function it’s passed. It does this with operation_fun/2. The Multi module has a matcher for each of the bulk operations; in my case I only needed one, for :soft_delete_all:
defp operation_fun({:soft_delete_all, fun, updates}, opts) do
fn repo, changes ->
{:ok, repo.update_all(fun.(changes), updates, opts)}
end
end
Again, this is identical (save the :soft_delete_all atom) to the Multi module’s own code. It runs our function, which builds a query, then passes our [set: [deleted_at: Timex.now()]] update along to repo.update_all/3, which updates the records.
In cases where we pass a query in:
|> soft_delete_all(:remove_customer, Query.from(c in Customer, where: c.id == 123))
We match on the next function head. Here again I followed Ecto’s pattern, writing my own custom add_operation/3:
defp add_operation(%Multi{} = multi, name, operation) do
%{operations: operations, names: names} = multi
if MapSet.member?(names, name) do
raise "#{Kernel.inspect(name)} is already a member of the Ecto.Multi: \n#{Kernel.inspect(multi)}"
else
%{multi | operations: [{name, operation} | operations], names: MapSet.put(names, name)}
end
end
This first checks that the operation name isn’t already in the Multi; if it’s not, we add the operation to the Multi’s operations list. This works because of the parameters we’ve passed it:
add_operation(multi, name, {:update_all, queryable, [set: [deleted_at: Timex.now()]], opts})
Specifically {:update_all, queryable, [set: [deleted_at: Timex.now()]], opts}: once again, we aren’t doing anything fancy to soft delete these records. We are using Multi’s ability to :update_all with our provided queryable, and the update we are making is [set: [deleted_at: Timex.now()]].
There you have it, it’s :update_all all the way down, which makes sense because we are updating a record instead of deleting it, but I think it’s a lot cleaner to write something like this:
query1 = from(c in Customer, where: c.last_purchase <= ^old_date)
query2 = from(u in User, join: c in assoc(u, :customer), on: c.last_purchase <= ^old_date)
Multi.new()
|> soft_delete_all(:customers, query1)
|> soft_delete_all(:users, query2)
#👆don't judge this contrived example it's not production code
Today I Learned ~D[2024-01-03]
You can use Erlang’s :timer.tc function to see how many microseconds a function call takes. For example, say you were curious whether Enum.filter/2 or Kernel.--/2 takes longer:
iex> vals = [1, 2, 3, 4, 5]
iex> :timer.tc(Enum, :filter, [vals, &(rem(&1, 2) == 1)])
{20, [1, 3, 5]}
iex> :timer.tc(Kernel, :--, [vals, [2, 4]])
{3, [1, 3, 5]}
Kernel.--/2 (i.e. vals -- [2, 4]) took 3 microseconds, while Enum.filter/2 (Enum.filter(vals, &(rem(&1, 2) == 1))) took 20.
This is a fairly trivial example but I could see this coming in handy with larger operations.
For more detailed analysis you can always use Benchee.
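For instance, a quick sketch of the same comparison with Benchee (assuming it’s added to your deps):

vals = [1, 2, 3, 4, 5]

Benchee.run(%{
  "Enum.filter" => fn -> Enum.filter(vals, &(rem(&1, 2) == 1)) end,
  "Kernel.--" => fn -> vals -- [2, 4] end
})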
Thanks to chriserin for helping me get the right Erlang syntax for tc.