Thursday, 4 March 2010

Working with Postgres views, Rails models and user-based security

So the biggest flaw in Rails at the moment is the fact that it assumes logging in with full application privileges to the database. This has many holes in it but the huge chasm, the barn door in terms of security breaches is that your data, potentially private data, is seriously at risk.

As awesome as Rails is, it cannot guarantee you that you will not make a mistake. One unfortunate bug that fails to implement user-based filtering of data and the wonders of Rails' ActiveRecord quickly becomes that proverbial thread on the thorn... the jumper unravels very quickly and the data security breach slide becomes an avalanche.

That is why database technology, since the mid 1900s, has had user-based security built into it. Users should be accessing data through views and updates should be made through triggers and rules. Your application should not be accessing the raw, underlying table. Postgres, Oracle, Ingres, DB2 and other "enterprise" databases all have a long track record of implementing these facilities. In fact, it would be a very foolhardy IT manager that sanctions the use of a database that doesn't have these services.

The correct place for data security to reside is within the database. Implementing this in the application is a highly insecure practice (a discussion worthy of several blog posts in its own right). Not only is leaving security up to the application insecure, it also introduces the very serious risk of bugs and "forgetting" to implement security during the natural maintenance lifecycle of the application. In addition to this, every application needs to implement the same security features for the same database, something that flies in the face of Rail's DRY principle and drives up the cost, complexity and maintenance overhead of the security system exponentially. For example, tweaking just one security rule forces an update, test, release cycle for every single application using that database.

The correct database implementation for data security within the database follows this pattern:

1) The DBA account owns the tables and has read/write/update/delete access to these tables. Specifically, the user accounts do not have any CRUD permissions on the underlying tables.
2) The DBA account creates views, rules and triggers on the tables. These have built in security and data filtering that restricts users to see only that data that they are entitled to see. User-based security for selects is implemented through views, user-based security for update, delete and insert functions is implemented through triggers and rules on the views.
3) The application logs in as a user account.
4) The DBA account grants only those permissions needed on each view/trigger to each user account required for the actions that the user in question has permissions to perform on that data.
5) The views restrict the user to just those rows (and columns) that they have permission to access. This is based on the user's login.
6) The triggers and rules ensure that the underlying data integrity is maintained, including the injection of security details (e.g. maintaining a user_id column or audit trail) where these are needed.

This approach radically simplifies the design and maintenance of real world web applications. It also reduces the risk of bugs and the introduction of security flaws through errors in application modifications and it provides a consistent enforcement of data integrity across all uses of the database, no matter how many applications use that database.

By maintaining a user_id column on each table, it is possible to write a generic view that restricts the user to only those rows that (s)he owns.

Take, for example, an accounting package. If you have the table salary that looks like this:

create table salary (
   id serial primary key,
   employee_name text,
   salary float,
   user_id int references users(id));

Obviously you would have a users table as well, for example:

create table users (
   id serial primary key,
   username text)

In the users table, the username column is the user's database account.

With the above salary table, you can could write a view to restrict user access to only "their" rows like this:

create view salary_vew as
  select, s.employee_name, s.salary
  from salary s join users u on
  where u.username::text = "current_user"()::text;

Thus, when the user selects from the salary_view, they will only see the rows containing their user_id. Note that we don't include the user_id in the view as this is part of the security system, not the application data.

Now, to implement insert, update or delete actions security, you need to implement rules on the view. This means that your application will be able to perform insert/update/delete actions on the view just as it would a normal table, with the rules system in Postgres implementing the actions on the real table (salary) with the added security constraints.

At this point, it's worth noting that this security model is very simple. In reality, security models can be complex beasts. For example, most companies have teams of people, i.e. user groups, who are able to access the same data, so restricting access by user_id is too simple. The views, rules and triggers will need to be group-aware and perform group lookups, allowing someone in the same group as the row "owner" to access that row with the same rights as the row owner.

This point highlights the power of this technique. Because the underlying security model can be modified and maintained independently of the applications that use the database as well as allowing for an implicit interaction between applications without needing to integrate them. For example, the company's HR system, which maintains who is employed in the company and what team they work in could be updating the group and user information held in the database as well as creating or deleting database accounts, independently of the applications that allow users to log in and work with the data in that database. Reporting systems can be implemented that report on data in the database, etc.

So far we have seen how to implement select-based security. And here are some examples of CRUD security.

For inserts, you would create an insert rule:
create or replace rule salary_ins as
  INSERT INTO salary (id,employee_name,salary,user_id)
  SELECT nextval(salary_id_seq'), NEW.employee_name,NEW.salary,
  FROM users u
  WHERE u.username::text = "current_user"()::text;

The above will create a rule that populates the user_id correctly based on the user's login. It does this by selecting the correct row of the users table. Thus, even if someone attempted to insert into the salary_view passing the user_id as a parameter, it would be ignored by the rule and the correct user_id will be used for that user's login. This means that attempts to breach the security system by spoofing other users is foiled. For example, if Jane creates a row that Pete should not have access to, but Jane attempts to pass Pete's user_id into the view in order to grant Pete access to that row, the attempt will be blocked at a database level.

The update view is very similar:

create or replace rule salary_upd as
  UPDATE salary s
  USING users u
  SET employee_name=NEW.employee_name,

  SELECT nextval(salary_id_seq'), NEW.employee_name,NEW.salary,
  WHERE u.username::text = "current_user"()::text

NOTE: The last line of the where clause (AND has to be included as this tells Postgres to append all qualifiers that the application may have added to the update. If you did not include that line, then Postgres's rules system would simply throw away any application qualifiers resulting in ALL the rows owned by the user being updated! A rather catastrophic bug!

For example if the application issued thsi:
update salary_view set salary=100 where employee_name = 'Joe Smith';

Then with the last row of the view added, Postgres includes the the "where employee_name = 'Joe Smith'" qualifier with the update specified by the rule, but without the last row in the rule, then every single salary record that the user maintains will have the salary set to 100!

The delete rule is equally as simple:

create or replace rule salary_del as
  DELETE FROM salary s
  WHERE u.username::text = "current_user"()::text

Note that the Postgres data identifier OLD is used, as this is a delete, not an insert. The same comment about the last row (AND and its implications apply to this rule as described for update above.

Quirk in Rails' use of sequences

The above works well in a Rails application, except for the sequence that Postgres maintains for the id column of the salary table. Rails very sensibly calls currval once it has inserted a row into Postgres to discover what the value for id is for the newly inserted row. However, because the table name from Rails' point of view is salary_view, it will attempt to select currval('salary_view_id_seq') which, of course doesn't exist because Postgres would have created the sequence on the salary table as salary_id_seq.

The simplest way around this is to use salary_view_id_seq within the rules and Postgres rather than try and change Rails' default sequence name construction. To do this, you simply declare the id column as type int not serial:

create table salary (
   id int not null primary key,
   employee_name text,
   salary float,
   user_id int references users(id));

And then you create the sequence:
  create sequence salary_view_id_seq;

and tell Postgres to use this for the id value in that table:
  alter table salary alter column id set default nextval('salary_view_id_seq');

You may ask "why bother specifying the sequence for the underlying table if the insert rule populates id in insert anyway?". The reason is to ensure that DBA operations such as bulk copies into the table and any other (legitimate) inserts into the underlying base table will remain consistent with the "normal" inserts via the insert rule.

Of course, this also means that your insert rule needs to use this sequence:

create or replace rule salary_ins as
  INSERT INTO salary (id,employee_name,salary,user_id)

  SELECT nextval(salary_view_id_seq'), NEW.employee_name,NEW.salary,
  FROM users u
  WHERE u.username::text = "current_user"()::text;

Now you have a working security system that enables you to explicitly grant access to users, tweak the security model without application changes, simplify your application development, reduce the number of bugs and the risk of security compromises introduced by bugs and keep it DRY across all the applications and the direct database access!

Of course, you need to make your Rails application log in as the user which you can do using the establish_connection command in your models.

As we use the auth_logic plugin, we establish and the globals $dbuser and $dbpass during application login (which is done via a very restricted general account that has just enough permissions to enable auth_logic to do its stuff),

In our database.yml we define an entry app which looks like this:
      host: localhost
      adapter: postgresql
      database: enterTheDbHere
      username: filledInAtRuntime
      password: filledInAtRuntime
      port: 5432

We then define a model class that our other models can inherit from:

# This class extends the normal Active Record to enable user-login database accounts to be used in the Models that are based on it.
# I.e. It allows us to implement database security and use views and triggers that base permissions on the user's database login

class UserLoginTable  < ActiveRecord::Base
  config = ActiveRecord::Base.configurations['app']



Our models that are based on views are now simple based on the above, for example:

class SalaryView < UserLoginTable
  set_table_name 'salary_view'

Voila! Rails has been enabled with security at a database level. You simply continue to use Rails, define models and write your application as you would have done in the unprotected, DBA-account-using, insecure default manner, but now you have all the benefits mentioned above and many more.

The killer feature of this approach is that it makes your applications simpler, easier, faster to develop and enterprise-class.


  1. This is a very interesting and insightful post, thanks (I can't believe there are no comments on it!). It has inspired me to think about how to migrate my application to this security model asap. There are a few things I'd need to solve and I wonder if you have any thoughts/experience on this one:

    Admin users have to be able to create accounts for other users. That means admin users would need to have privileges to a trigger, perhaps, that would create the accounts/roles. Or is there a better/safer way?

  2. Furthermore, doesn't this setup mean you need many more database connections enabled (Postgres's MAX_CONNECTIONS) than you would with a "regular" Rails setup, depending on the number of database users?

  3. To continue my dialogue with myself...

    I've been doing some experimentation with this and have found a significant issue that I haven't been able to solve yet. If you have foreign key constraints, make use of accepts_nested_attributes_for, and have a form with nested objects (parent-child relationship), then saving a new record will fail. The child record will complain that no parent record exists; for example (from the log):

    INSERT INTO "table_one_view" ("field1", "field2") VALUES(E'Some text', E'Some more text') RETURNING "id"
    SELECT * FROM "table_one_view" WHERE ("table_one_view"."id" = 10138)
    PGError: ERROR: insert or update on table "table_two" violates foreign key constraint "table_two_table_one_id_fkey"
    DETAIL: Key (table_one_id)=(10138) is not present in table "table_one".
    : INSERT INTO "table_two_view" ("table_one_id", "field1") VALUES(10138, E'Some text') RETURNING "id"
    DATABASE EXCEPTION: Statement invalid.

    The reason this happens is that the rule on the view that inserts data into the base table does not actually fire until after all of the original query runs (see below). Crucially, the original query includes creation of the child records, thanks to Rails's accepts_nested_attributes_for. So, even though an id (10138 above) is being returned (from the sequence the view is accessing) for the soon-to-be-created parent record in table_one, the record does not actually exist when the child records in table_two are created (using that id!).

    Setting the foreign key constraint "table_two_table_one_id_fkey" to DEFERRABLE INITIALLY DEFERRED does not help because of the nested transactions. The deferred foreign key constraint is checked at the end of the inner transaction, after the insert into table_two_view, not at the end of the outer transaction, which is where the actual insert into table_one (from the rule) takes place.

    I have verified that everything works ok if we are just inserting into table_one_view without child records in table_two.

    From the PostgreSQL 8.4 manual, section 36.3.1. How Update Rules Work:
    "For ON INSERT rules, the original query (if not suppressed by INSTEAD) is done before any actions
    added by rules. This allows the actions to see the inserted row(s). But for ON UPDATE and ON DELETE
    rules, the original query is done after the actions added by rules. This ensures that the actions can see
    the to-be-updated or to-be-deleted rows; otherwise, the actions might do nothing because they find no
    rows matching their qualifications."

  4. While many PCs often come with a standard application designed to protect the computer and its files from viruses, hackers and malware, the pre-installed software can often be lacking.
    iDeals data room m&a