ADF: Entity-State vs Post-State

There are lots of blog posts on this, but they all copy and paste the EntityImpl javadoc, which is very obscure. I finally found a sensible explanation in Dive into Oracle ADF, but that post is usually buried deep in Google results.

I will restate the Dive into Oracle ADF post here and add a few finer details.

In a DB you can run DML statements as many times as you want without committing them. When we save data to the DB, the call chain is: transaction.postChanges() -> entity.postChanges() -> entity.prepareForDML() -> entity.doDML(). The entity’s DML operations depend on its post-state. So transaction.postChanges() effectively makes all entities write their data to the DB. They do that by checking their internal post-states and creating the appropriate DML statements based on those states. After the DML action the post-state is updated. So if the post-state was STATUS_NEW before postChanges(), it is STATUS_UNMODIFIED afterwards, so that the next postChanges() call does not try to insert the row again.

Entity-state, on the other hand, represents the entity’s state irrespective of whether the changes have been posted (written) to the DB. So even if all changes in the entity are posted, the entity-state remains STATUS_MODIFIED. Entity-state changes only after commit is invoked on the transaction. Until a transaction is committed, the changes made in it are not visible to other transactions (this is a feature of relational DBs), so when the entity-state changes we know that those changes are now visible to all transactions.

An example:

#!java
row = vo.first();  /* read a row in from DB.  Both EO states are UNMODIFIED */

row.setAttribute("SomeAttr", someValue); /* After this, both states are MODIFIED */

am.getTransaction().postChanges(); /* Changes are written to DB. Trans not committed yet */
                                   /* After this, post-state is UNMODIFIED. Entity-state is MODIFIED */

am.getTransaction().commit(); /* Transaction committed. After this, both states are UNMODIFIED */

If you look at EntityImpl’s javadoc you will notice one extra post-state: STATUS_INITIALIZED. An entity has this post-state only when it is newly created. The moment one of its attributes is set, this changes to STATUS_NEW. When postChanges() is invoked, the DML action is skipped if the post-state is STATUS_INITIALIZED. This prevents blank rows from being inserted. Entity-state never has this value, since entity-state is meant to be used for writing business logic, and from that perspective STATUS_NEW and STATUS_INITIALIZED make no difference.
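To make the transitions described above concrete, here is a minimal plain-Java sketch. It is a toy model, not Oracle's EntityImpl: the class ToyEntity, its Status enum and all of its methods are hypothetical names invented for illustration, and the only behaviour it encodes is what this post states (including the assumption that a newly created row reports entity-state NEW while its post-state is INITIALIZED).

```java
/** Toy model of entity-state vs post-state; a sketch, NOT Oracle's EntityImpl. */
class ToyEntity {
    enum Status { INITIALIZED, NEW, MODIFIED, UNMODIFIED }

    private Status entityState;
    private Status postState;
    private int dmlCount = 0; // how many simulated DML statements were issued

    private ToyEntity(Status entityState, Status postState) {
        this.entityState = entityState;
        this.postState = postState;
    }

    /** A row read from the DB: both states start as UNMODIFIED. */
    static ToyEntity readFromDb() {
        return new ToyEntity(Status.UNMODIFIED, Status.UNMODIFIED);
    }

    /** A freshly created row: post-state INITIALIZED, entity-state NEW (assumption). */
    static ToyEntity createNew() {
        return new ToyEntity(Status.NEW, Status.INITIALIZED);
    }

    /** Setting any attribute dirties both states. */
    void setAttribute() {
        if (postState == Status.INITIALIZED) {
            postState = Status.NEW;
        } else if (postState == Status.UNMODIFIED) {
            postState = Status.MODIFIED;
        }
        if (entityState == Status.UNMODIFIED) {
            entityState = Status.MODIFIED;
        }
    }

    /** Writes pending changes; only the post-state is reset. */
    void postChanges() {
        if (postState == Status.NEW || postState == Status.MODIFIED) {
            dmlCount++;                     // stands in for the INSERT/UPDATE
            postState = Status.UNMODIFIED;  // a second post will not repeat the DML
        }
        // INITIALIZED and UNMODIFIED rows: nothing to write, DML is skipped
    }

    /** Commit posts first, then resets the entity-state of posted rows. */
    void commit() {
        postChanges();
        if (postState == Status.UNMODIFIED) {
            entityState = Status.UNMODIFIED;
        }
    }

    Status getEntityState() { return entityState; }
    Status getPostState() { return postState; }
    int getDmlCount() { return dmlCount; }

    public static void main(String[] args) {
        ToyEntity row = ToyEntity.readFromDb();
        row.setAttribute();
        row.postChanges();
        System.out.println(row.getPostState() + " / " + row.getEntityState());
        row.commit();
        System.out.println(row.getPostState() + " / " + row.getEntityState());
    }
}
```

In this sketch, calling postChanges() twice in a row issues only one simulated DML statement, and an untouched row from createNew() issues none.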

ADF: EntityImpl.refresh(…) has no effect

We often use entity.refresh(REFRESH_FORGET_NEW_ROWS | REFRESH_UNDO_CHANGES) to prevent changes we made to an entity from being committed to the DB. What the refresh() method effectively does is change the post-state of the entity. (See difference between entity-state and post-state.) The post-state is later used to decide which DML operation to use for the entity, i.e. DML_DELETE, DML_INSERT or DML_UPDATE (ref).

I recently came across some code that called refresh() from the entity’s prepareForDML(), but that had no effect. The ADF documentation says nothing about such behaviour. I read the EntityImpl code and got my answer there. When we save data to the DB, the call chain is: transaction.postChanges() -> entity.postChanges() -> entity.prepareForDML() -> entity.doDML().

prepareForDML() and doDML() take operation as their first argument; this is the DML operation to perform. The code that decides which DML operation to perform, based on the entity’s post-state, is in EntityImpl.postChanges(). That method invokes prepareForDML() with the operation it has already chosen, so changing the post-state from prepareForDML() or doDML() is useless. The post-state therefore needs to be set no later than postChanges(). However, if you do need to do this from prepareForDML() or doDML(), you could modify the value of the operation parameter appropriately.
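The ordering problem can be sketched in plain Java. Again this is a toy model and not Oracle's code: ToyDmlEntity, LateRefreshEntity, OverrideOperationEntity and the constant values are hypothetical, and assigning postState inside prepareForDML() is only a stand-in for calling refresh() there. The point it illustrates is that the operation is chosen before the hooks run, so only changing the operation argument has any effect.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the post call chain; a sketch, NOT Oracle's EntityImpl. */
class ToyDmlEntity {
    // Constant values are arbitrary in this sketch.
    static final int DML_INSERT = 1;
    static final int DML_UPDATE = 2;
    static final int DML_DELETE = 3;

    enum PostState { NEW, MODIFIED, DELETED, UNMODIFIED }

    PostState postState;
    final List<String> log = new ArrayList<>(); // records the DML actually run

    ToyDmlEntity(PostState postState) {
        this.postState = postState;
    }

    /** The operation is decided HERE, before either hook is called. */
    void postChanges() {
        int operation;
        switch (postState) {
            case NEW:      operation = DML_INSERT; break;
            case MODIFIED: operation = DML_UPDATE; break;
            case DELETED:  operation = DML_DELETE; break;
            default:       return; // nothing to post
        }
        prepareForDML(operation);
        doDML(operation);
        postState = PostState.UNMODIFIED;
    }

    void prepareForDML(int operation) {
        // hook for subclasses
    }

    void doDML(int operation) {
        log.add(operation == DML_INSERT ? "INSERT"
              : operation == DML_UPDATE ? "UPDATE" : "DELETE");
    }
}

/** Changes the post-state in the hook: too late, the INSERT still runs. */
class LateRefreshEntity extends ToyDmlEntity {
    LateRefreshEntity(PostState s) { super(s); }

    @Override
    void prepareForDML(int operation) {
        postState = PostState.UNMODIFIED; // stand-in for refresh(); has no effect
    }
}

/** Changes the operation argument instead: this does take effect. */
class OverrideOperationEntity extends ToyDmlEntity {
    OverrideOperationEntity(PostState s) { super(s); }

    @Override
    void doDML(int operation) {
        // e.g. turn a delete into an update (a "soft delete")
        super.doDML(operation == DML_DELETE ? DML_UPDATE : operation);
    }
}
```

Posting a LateRefreshEntity created with PostState.NEW still logs an INSERT, while posting an OverrideOperationEntity created with PostState.DELETED logs an UPDATE.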

Create a Tag field using Django-Select2.

The excellent Select2 framework has supported tags for a long time, but Django-Select2 lacked that support until now (version 4.2.0).

Tag fields are very much like any multiple-value input field, except that they allow users to enter values which do not exist in the backing model, so that users can add new tags.

For this purpose a few new fields have been added to Django-Select2. They all have the suffix TagField. A few widgets have also been added to automatically configure Select2 JS to run in “tagging mode”.

You can see the full reference in the docs: http://django-select2.readthedocs.org/en/latest/ref_fields.html and http://django-select2.readthedocs.org/en/latest/ref_widgets.html.

Simple tag field implementation example

You can find this code in testapp too.

models.py:-

#!python
from django.db import models

class Tag(models.Model):
    tag = models.CharField(max_length=10, unique=True)

    def __unicode__(self):
        return unicode(self.tag)

class Question(models.Model):
    question = models.CharField(max_length=200)
    description = models.CharField(max_length=800)
    tags = models.ManyToManyField(Tag)

    def __unicode__(self):
        return unicode(self.question)

forms.py:-

#!python
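from django import forms
from django_select2 import AutoModelSelect2TagField

from .models import Tag, Question

class TagField(AutoModelSelect2TagField):
    queryset = Tag.objects
    search_fields = ['tag__icontains', ]

    def get_model_field_values(self, value):
        # Map the text the user typed to the model attributes of the new Tag.
        return {'tag': value}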

class QuestionForm(forms.ModelForm):
    question = forms.CharField()
    description = forms.CharField(widget=forms.Textarea)
    tags = TagField()

    class Meta:
        model = Question

Above I am creating a form which website users can use to submit questions. The thing to note in TagField is that it is used almost like any other AutoModel field in Django-Select2, except that here we override one new method, get_model_field_values().

When users submit a tag field, the usual validation runs, but with a twist. If a provided value is found not to exist, the field creates that value instead of raising an error. However, the library does not know how to create a new value instance, so it invokes create_new_value() to do that job.

The model version of the tag field already knows that it is dealing with Django models, so it knows how to instantiate one, but it does not know what values to set for the attributes. So it implements create_new_value(), which in turn invokes get_model_field_values() to get the required attribute names mapped to their values.

Before you continue

One key thing to remember is that, unlike other widgets, you ideally should not use tag widgets with normal Django fields, because Django fields do not allow creation of a new choice value. If you want tagging support but do not want to allow users to create new tags, you can pair the tag widget with a normal Django field. However, that may not be a good idea, since the UI would still allow creation of new tags, and the user would then get an error on submit.

Making it better

We can make this better and more interesting. A typical tagging system should have the following features.

  • Ability to detect misspellings. Peter Norvig’s essay on this is excellent. More information can be found on Stack Overflow.
  • Use statistics to order the results. This is very useful when the tag count balloons. The statistics could be based on which tags are frequently used together. Of course, when you start a site you will not have any data, so for an initial period you can set the algorithm to only learn.
  • Cache frequently used tags. This is a common optimization technique. A memcache-like layer is usually used to cache the DB data, and the DB is hit only if the data is not found there.

Find one’s complement of a decimal number.

One’s complement of the binary number \(1001_{2}\) is \(0110_{2}\). That is, we simply flip the bits. This is easy to do in binary. However, if the number is in base 10, how do we do that? Convert it to base 2, flip the bits and convert it back to base 10? Maybe, but if the number is stored in a native data type (like int) then we can simply use bitwise NOT to invert the bits, without any conversion.

In my case it wasn’t so simple. I had a really big number, which had to be stored in a BigDecimal. Converting to base 2 and back is also very costly. So an efficient algorithm for that is…

\(x'\) is the one’s complement of \(x\), and it has \(b\) bits, or we are interested in only \(b\) bits of it.

$$ \begin{align} &x + x' = 2^{b} - 1 \tag{all ones for all b bits} \\ &\Rightarrow x' = 2^{b} - 1 - x \end{align} $$

So, the equivalent Java code for the above is…

#!java
import java.math.BigDecimal;

public static BigDecimal onesComplement(BigDecimal n, int totalBits) {
    // 2^totalBits - 1 is the all-ones value; subtracting n from it flips n's bits.
    return new BigDecimal(2).pow(totalBits).subtract(BigDecimal.ONE).subtract(n);
}
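
As a sanity check on the subtraction approach: for a non-negative x that fits in b bits, subtracting x from the all-ones value 2^b - 1 should give exactly the same result as flipping each of the b bits. The sketch below compares the two. It uses BigInteger rather than BigDecimal, purely because BigInteger offers bitwise operations to compare against; the class and method names are made up for this example.

```java
import java.math.BigInteger;

/** Cross-checks the arithmetic one's complement against a real bit flip. */
class OnesComplementCheck {

    /** All-ones value for the given width: 2^bits - 1. */
    static BigInteger allOnes(int bits) {
        return BigInteger.ONE.shiftLeft(bits).subtract(BigInteger.ONE);
    }

    /** Arithmetic form: (2^bits - 1) - x. Assumes 0 <= x < 2^bits. */
    static BigInteger byArithmetic(BigInteger x, int bits) {
        return allOnes(bits).subtract(x);
    }

    /** Bitwise form: XOR with the all-ones mask flips each of the low bits. */
    static BigInteger byBitFlip(BigInteger x, int bits) {
        return x.xor(allOnes(bits));
    }

    public static void main(String[] args) {
        BigInteger x = new BigInteger("1001", 2); // 9
        // Both lines print 110, i.e. 0110 without the leading zero.
        System.out.println(byArithmetic(x, 4).toString(2));
        System.out.println(byBitFlip(x, 4).toString(2));
    }
}
```

For 1001 with b = 4 both methods give 0110 (6 in decimal), matching the example at the top of this post.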