Firefox 34 search engine bar: make it stops

New look and feel of the search bar in Firefox 34 makes it hard to use shortcuts. So let's revert it back.

In about:config, toggle the pref browser.search.showOneOffButtons


Alternative Way to Daemonize Java Applications on Systemd (CentOS7/RHEL7)

Source: http://ae.koroglu.org/alternative-way-to-daemonize-java-applications-on-systemd-centos7rhel7/

I already explained how to daemonize java applications on SysV-style system in here. Since CentOS7/RHEL7 comes with Systemd which is a system and service manager for Linux we migrate old init script to the new system.
Again we’ll use our best budy Daemonize but this time we gonna compile it from source because although it’s signed as approved on fedora package db, Daemonize is not in EPEL7 repository for now. I’ll not going to details how to compile install etc. but I assume that you installed daemonize into /usr/local/sbin
So we need to create two files /etc/sysconfig/fixtures and /lib/systemd/system/fixtures.service
This 1st file is where we define java releated variables such as user, java path, arguments, log files etc..
# Configz for fixtures service
ARG1="-Dfile.encoding=UTF-8 -Dproject.properties=/opt/pronet/fixtures/fixtures.properties"
ARG3="-jar /opt/pronet/fixtures/fixtures.jar"
2nd file is service file for fixtures where we define systemd releted variables. There are plenty of documents in Freedesktop Systemd wiki, if you want to know more about I advice you to read them. But roughly unit: consist information about a service, a socket, a device etc, service: information about a process controlled and supervised by systemd and install: installation information for the unit
Description=Fixtures Service
ExecStart=/usr/local/sbin/daemonize -u $JAVA_USER -o $JAVA_STDOUT -e $JAVA_STDERR -c $JAVA_APPDIR $JAVA_BIN $ARG1 $ARG2 $ARG3
ExecStop=/bin/kill -TERM $MAINPID
Let’s start and stop the service
[root@Srv25 pronet]# systemctl start fixtures
[root@Srv25 pronet]# systemctl stop fixtures
if there is something wrong all service files and docker containers insert data into the systemd journal and we can read the journal :)
[root@Srv25 pronet]# journalctl -u fixtures.service
Checking the service status
[root@Srv25 pronet]# systemctl status fixtures
fixtures.service - Fixtures Service
   Loaded: loaded (/usr/lib/systemd/system/fixtures.service; disabled)
   Active: active (running) since Wed 2014-10-29 21:21:49 EET; 13min ago
 Main PID: 28859 (java)
   CGroup: /system.slice/fixtures.service
           └─28859 /usr/java/jdk1.7.0_71/bin/java -Dfile.encoding=UTF-8 -Dproject.properties=/opt/pronet/fixtures/fixtures.properties -Dlog4j.configuration=file:...
Oct 29 21:21:49 Srv25 systemd[1]: Started Fixtures Service.
Enable a service to be started on bootup
[root@Srv25 pronet]# systemctl enable fixtures.service
ln -s '/usr/lib/systemd/system/fixtures.service' '/etc/systemd/system/multi-user.target.wants/fixtures.service'
[root@Srv25 pronet]#
So that’s how it works..


How to select the first/least/max row per group in SQL

Here are some common SQL problems, all of which have related solutions: how do I find the most recent log entry for each program? How do I find the most popular item from each category? How do I find the top score for each player? In general, these types of “select the extreme from each group” queries can be solved with the same techniques. I’ll explain how to do that in this article, including the harder problem of selecting the top N entries, not just the top 1.
This topic is related to numbering rows, which I just wrote about (see my articles about MySQL-specific and generic techniques to assign a number to each row in a group). Therefore I’ll use nearly the same table and data as I used in those articles, with the addition of a price column:
| type   | variety    | price |
| apple  | gala       |  2.79 | 
| apple  | fuji       |  0.24 | 
| apple  | limbertwig |  2.87 | 
| orange | valencia   |  3.59 | 
| orange | navel      |  9.36 | 
| pear   | bradford   |  6.05 | 
| pear   | bartlett   |  2.14 | 
| cherry | bing       |  2.55 | 
| cherry | chelan     |  6.33 | 

Selecting the one maximum row from each group

Let’s say I want to select the most recent log entry for each program, or the most recent changes in an audit table, or something of the sort. This question comes up over and over on IRC channels and mailing lists. I’ll re-phrase the question in terms of fruits. I want to select the cheapest fruit from each type. Here’s the desired result:
| type   | variety  | price |
| apple  | fuji     |  0.24 | 
| orange | valencia |  3.59 | 
| pear   | bartlett |  2.14 | 
| cherry | bing     |  2.55 | 
There are a few common solutions to this problem. All involve two steps: finding the desired value of price, and then selecting the rest of the row based on that.
One common solution is a so-called self-join. Step one is to group the fruits by type (apple, cherry etc) and choose the minimum price:
select type, min(price) as minprice
from fruits
group by type;
| type   | minprice |
| apple  |     0.24 | 
| cherry |     2.55 | 
| orange |     3.59 | 
| pear   |     2.14 | 
Step two is to select the rest of the row by joining these results back to the same table. Since the first query is grouped, it needs to be put into a subquery so it can be joined against the non-grouped table:
select f.type, f.variety, f.price
from (
   select type, min(price) as minprice
   from fruits group by type
) as x inner join fruits as f on f.type = x.type and f.price = x.minprice;
| type   | variety  | price |
| apple  | fuji     |  0.24 | 
| cherry | bing     |  2.55 | 
| orange | valencia |  3.59 | 
| pear   | bartlett |  2.14 | 
Another common way to do this is with a correlated subquery. This can be much less efficient, depending on how good your system’s query optimizer is. You might find it clearer, though.
select type, variety, price
from fruits
where price = (select min(price) from fruits as f where f.type = fruits.type);
| type   | variety  | price |
| apple  | fuji     |  0.24 | 
| orange | valencia |  3.59 | 
| pear   | bartlett |  2.14 | 
| cherry | bing     |  2.55 | 
Both queries are logically equivalent, though they may not perform the same.

Select the top N rows from each group

This is a slightly harder problem to solve. Finding a single row from each group is easy with SQL’s aggregate functions (MIN(), MAX(), and so on). Finding the first several from each group is not possible with that method because aggregate functions only return a single value. Still, it’s possible to do.
Let’s say I want to select the two cheapest fruits from each type. Here’s a first try:
select type, variety, price
from fruits
where price = (select min(price) from fruits as f where f.type = fruits.type)
   or price = (select min(price) from fruits as f where f.type = fruits.type
      and price > (select min(price) from fruits as f2 where f2.type = fruits.type));
| type   | variety  | price |
| apple  | gala     |  2.79 | 
| apple  | fuji     |  0.24 | 
| orange | valencia |  3.59 | 
| orange | navel    |  9.36 | 
| pear   | bradford |  6.05 | 
| pear   | bartlett |  2.14 | 
| cherry | bing     |  2.55 | 
| cherry | chelan   |  6.33 | 
Yuck! That can be written as a self-join, but it’s just as bad (I leave it as an exercise for the reader). This gets worse as you go to higher numbers (top 3, top 4…). There are other ways to phrase the statement, but they all boil down to the same thing, and they’re all pretty unwieldy and inefficient.
There’s a better way: select the variety from each type where the variety is no more than the second-cheapest of that type.
select type, variety, price
from fruits
where (
   select count(*) from fruits as f
   where f.type = fruits.type and f.price <= fruits.price
) <= 2;
This is elegant, and lets you vary N without rewriting your query (a very good thing!), but it’s functionally the same as the previous query. Both are essentially a quadratic algorithm relative to the number of varieties in each type. And again, some query optimizers may not do well with this and make it quadratic with respect to the number of rows in the table overall (especially if no useful index is defined), and the server might get clobbered. Are there better ways? Can it be done with one pass through the data, instead of the many passes required by a correlated subquery? You know it can, or I wouldn’t be writing this, now would I?


If there’s an index on (type, price), and there are many more records to eliminate than to include in each group, a more efficient single-pass method (especially for MySQL, but also for some other RDBMSs) is to break the queries out separately and put a limit on each, then UNION them all back together. Here’s the syntax you need for MySQL:
(select * from fruits where type = 'apple' order by price limit 2)
union all
(select * from fruits where type = 'orange' order by price limit 2)
union all
(select * from fruits where type = 'pear' order by price limit 2)
union all
(select * from fruits where type = 'cherry' order by price limit 2)
Peter Zaitsev has written in detail about this technique, so I won’t go into it too much more here. If it suits your purposes, it can be a very good solution.
One note: use UNION ALL, not just UNION. It prevents the server sorting the results to eliminate duplicates before returning them. In this case there will be no duplicates, so I’m telling the server to skip that (useless, expensive) step.

Do it with user variables on MySQL

The UNION trick is an especially good idea when the results are a small fraction of the rows in the table and there is an index that can be used for sorting the rows. Another linear-time technique, which might be a good option in cases where you are selecting most of the rows from the table anyway, is user variables. This is MySQL-specific. Please refer to my previous post on how to number rows in MySQL for the gory details of why this works:
set @num := 0, @type := '';

select type, variety, price
from (
   select type, variety, price,
      @num := if(@type = type, @num + 1, 1) as row_number,
      @type := type as dummy
  from fruits
  order by type, price
) as x where x.row_number <= 2;
This isn’t one pass through the table, by the way. The subquery is implemented as a temporary table behind the scenes, so filling it with data is one pass; then selecting every row from it and applying the WHERE clause is another. However, twice through is still O(n) with respect to the table size. That’s a lot better than correlated subqueries, which are O(n2) with respect to the group size – even moderate group sizes cause bad performance (say there are five varieties of each fruit. That’s on the order of 25 passes through the table, all told).

One-pass technique on MySQL… maybe?

If you want to leave your queries up the the query optimizer’s whims, you can try this technique, which builds no temporary tables and makes just one pass through:
set @num := 0, @type := '';

select type, variety, price,
      @num := if(@type = type, @num + 1, 1) as row_number,
      @type := type as dummy
from fruits
group by type, price, variety
having row_number <= 2;
This theoretically ought to work if MySQL orders by the GROUP BY criteria, which it sometimes does for efficiency and to produce the expected results. Does it work? Here’s what it returns on MySQL 5.0.27 on Windows:
| type   | variety  | price | row_number | dummy  |
| apple  | gala     |  2.79 |          1 | apple  |
| apple  | fuji     |  0.24 |          3 | apple  |
| orange | valencia |  3.59 |          1 | orange |
| orange | navel    |  9.36 |          3 | orange |
| pear   | bradford |  6.05 |          1 | pear   |
| pear   | bartlett |  2.14 |          3 | pear   |
| cherry | bing     |  2.55 |          1 | cherry |
| cherry | chelan   |  6.33 |          3 | cherry |
Look closely… it’s returning rows one and three from each group, and they’re not numbered in order of increasing price? Huh? But the HAVING clause says the row_number should be no greater than 2! Here’s what it returns on version 5.0.24a on Ubuntu:
| type   | variety    | price | row_number | dummy  |
| apple  | fuji       |  0.24 |          1 | apple  |
| apple  | gala       |  2.79 |          1 | apple  |
| apple  | limbertwig |  2.87 |          1 | apple  |
| cherry | bing       |  2.55 |          1 | cherry |
| cherry | chelan     |  6.33 |          1 | cherry |
| orange | valencia   |  3.59 |          1 | orange |
| orange | navel      |  9.36 |          1 | orange |
| pear   | bartlett   |  2.14 |          1 | pear   |
| pear   | bradford   |  6.05 |          1 | pear   |
Look, this time everything is numbered 1 and every row is returned. Wonky. This is exactly what the MySQL manual page on user variables warns about.
This technique is pretty much non-deterministic, because it relies on things that you and I don’t get to control directly, such as which indexes MySQL decides to use for grouping. However, if you need to use it – and I know there are some folks out there who do, because I’ve consulted for them – you can still tweak it. We’re getting into the realm of really bastardizing SQL, but the results above came from a table without indexes other than the primary key on (type, variety). What happens if I add an index MySQL can use for grouping?
alter table fruits add key(type, price);
Nothing changes, and EXPLAIN shows why: the query doesn’t use the index I just added. Why? Because the grouping is on three columns, and the index is only on two. In fact, the query is using a temp table and filesort anyway, so this is still not achieving the once-through goal. I can force it to use the index:
set @num := 0, @type := '';

select type, variety, price,
      @num := if(@type = type, @num + 1, 1) as row_number,
      @type := type as dummy
from fruits force index(type)
group by type, price, variety
having row_number <= 2;
Let’s see if that works:
| type   | variety  | price | row_number | dummy  |
| apple  | fuji     |  0.24 |          1 | apple  | 
| apple  | gala     |  2.79 |          2 | apple  | 
| cherry | bing     |  2.55 |          1 | cherry | 
| cherry | chelan   |  6.33 |          2 | cherry | 
| orange | valencia |  3.59 |          1 | orange | 
| orange | navel    |  9.36 |          2 | orange | 
| pear   | bartlett |  2.14 |          1 | pear   | 
| pear   | bradford |  6.05 |          2 | pear   | 
Ah, now we’re cooking! It did what I wanted, without a filesort or temporary table. Another way to do this, by the way, is to take variety out of the GROUP BY so it uses the index on its own. Because this selects a non-grouped column from a grouped query, this only works if you are running with ONLY_FULL_GROUP_BY mode turned off, which I hope you are not doing without good reason.

Source: http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/

Improved Persistent Login Cookie Best Practice

  1. When the user successfully logs in with Remember Me checked, a login cookie is issued in addition to the standard session management cookie.
  2. The login cookie contains the user's username, a series identifier, and a token. The series and token are unguessable random numbers from a suitably large space. All three are stored together in a database table.
  3. When a non-logged-in user visits the site and presents a login cookie, the username, series, and token are looked up in the database.
    1. If the triplet is present, the user is considered authenticated. The used token is removed from the database. A new token is generated, stored in database with the username and the same series identifier, and a new login cookie containing all three is issued to the user.
    2. If the username and series are present but the token does not match, a theft is assumed. The user receives a strongly worded warning and all of the user's remembered sessions are deleted.
    3. If the username and series are not present, the login cookie is ignored.



Не зважаючи на завантаженість, купу проблем, курс гривні і на те, що начальник – підарас, знайдіть якусь хвилину часу, поїдьте на Видубичі і просто поговоріть з деревами… Ні, я не йопнувся…
Просто підійдіть і скажіть дереву: “Привіт, дерево! Ти пожовкло вже…” - Да, відповість дерево, - бо осінь, і Покрова”…. “То нічого”, - скажете ви, - Ти оживеш навесні і знову будеш зелене…” – Я знаю, - відповість дерево, - навесні всі оживають… Хоча не факт. Був у мене дід, він теж так казав, а на Різдво, ні з того - ні з сього впав на землю прямо в сніг… Правда, навколо нього тепер якісь нові дерева наросли, мабуть, онуки… -Мабуть, - скажете ви йому…
А ти знаєш, дерево, що на початку Андріївського узвозу росте стара липа… Її посадив ще, по-моєму, Петро Могила. І досі жива вона… А коло Полтави ростуть дуби Кочубея. Дуже давні… Але – живі… - Та знаю-знаю, відповість дерево… “По-моєму це чистий PR і реклама…” – Я даже про баобаб, чула, - скаже дерево… В Африці росте. То йому даже по тисячі год буває… Ну лічно ніколи не бачила… Баб бачила (вони в церкву приходять) а баобаб – ні… - Може, це якась лєгєнда і їх немає насправді… Може, й лєгенда, - скажете ви йому, щоб не розстраювать… А ти знаєш, дерево, я колись був в Африці і склав вірш про дерева… Який? - запитає дерево… -“Африканські чорні баби сильно люблять баобаби. Там на кожну з чорних баб, припадає баобаб. А українські баби люблять ягоди й гриби”… - Такоє, - відповість дерево… Ліріка… Кстаті, подивись там внизу у мене гриб росте… Красівий такий - червона шапка і білий горошок по ній… Можеш взять, єслі хочеш… - Добре! - скажете ви, - знову ж таки, щоб не розстраювать його…
Ну, пока, дерево! Побачимося навесні… - Пока! Давай побачимся, спасіба що прийшов… З нами рідко балакають, ну ми любимо поболтать вообще-то… Тільки не кажи нікому… - Ні, не скажу… Бувай!
Тепер можна обнять дерево… І навіть поцілувать його в кору… Щоб ніхто не бачив (бо, можуть мінтів визвать)… І пожать йому руку… В смислі лапу... В смислі – гілку…
Цілком можливо, що дерево вам не відповість нічого. Ну, це не важно… Це все-одно допомагає… На цілу зиму…

(С) Vitalii Chepynoga


JVM tweaks


DCEVM - Dynamic Code Evolution VM

When a java developer works, she spends a lot of time reloading the server context.

The Dynamic Code Evolution Virtual Machine (DCE VM) is a modification of the Java HotSpot(TM) VM that allows unlimited redefinition of loaded classes at runtime. The current hotswapping mechanism of the HotSpot(TM) VM allows only changing method bodies. Our enhanced VM allows adding and removing fields and methods as well as changes to the super types of a class.

It is an open source project released under the GPL v2.0 license.

There's a handy plugin for Intellij, that just downloads DCEVM JRE and registers it inside IDE.
Installation is performed with these steps:
  • Install a DCEVM in Intellij as follows: File -> Settings -> Plugins -> search for DCEVM -> Install.
  • Restart the program and see in the Event Log something like that:
    "DCEVM is available for your environment: Would you like to download it?" 
  • Downloading will be performed into something like that: c:\Users\anton\.IntelliJIdea13\config\plugins\DCEVM_JRE
  • Now that we have a DCEVM JRE installed, let’s update the configuration.
  • Edit Tomcat debug configurations by specifying new JRE. 
  • Run/Debug Configurations -> Server tab -> Use alternative JRE: c:\Users\anton\.IntelliJIdea13\config\plugins\DCEVM_JRE
  • Run/Debug Configurations -> Server tab -> "On frame deactivation" –> update classes and resources
Now run the project in debug option and try to changes some classes and wait until the code reloads.

Consider tweaking your JAVA_OPTS as described in this post


Just a good skeleton for a new Scala project

This project aims to simplify creating a project from scratch.It's simpler than g8 or typesafe activator because it doesn't require any additional tools, just clone, build and run. It provides a simple way to configure:


Originally coded by: https://github.com/fractal


Implicit Parameters vs Default Parameter Values

There are, at least, two techniques in Scala to pass default value to a method

1) default parameter value
scala> def f(i: Int = 0) = i
f: (i: Int)Int

scala> f()
res0: Int = 0

scala> f(1)
res1: Int = 1
2) implicit parameter
scala> def g(implicit i: Int) = i
g: (implicit i: Int)Int

scala> implicit val default = 0
default: Int = 0

scala> g(1)
res5: Int = 1

scala> g
res7: Int = 0
In which case do you choose one or another ? With the power of implicit, default values are they a usefull feature ?

You should definitely prefer default parameter value.
  1. You should never create or use implicit parameters of general types like Int or String. See citation below.
  2. Default value is the simplest solution. In terms of language features complexity.
  3. Implicit parameters are for some kind of "context" for your method. If there is no context, just default value you can confuse other developers.
  4. Implicit value search will cost you some amount of compilation time.
  5. Implicit parameters should be specified manually only in rare cases.
Real world examples of implicit params: Akka actors ActorRef method

def !(message: Any)(implicit sender: ActorRef = Actor.noSender): Unit

Here we are automatically pass along the sender reference while sending a message to another Actor.
This code will, if invoked from within Actor 'A' automatically pass along the ActorRef of Actor 'A' as the sender of this message:

// From within an Actor
greeter ! Greet


Scala Traits

Scala traits are like Java interfaces with concrete methods, but they can actually do much more.

Traits can, for example, declare fields and maintain state.
In fact, you can do anything in a trait definition that you can do in a class definition, and the syntax looks exactly the same, with only two exceptions.

First, a trait cannot have any "class" parameters, i.e., parameters passed to the primary constructor of a class. In other words, although you could define a class like this:
  class Point(x: Int, y: Int)
The following attempt to define a trait would not compile:
  trait NoPoint(x: Int, y: Int// Does not compile

The other difference between classes and traits is that whereas in classes, super calls are statically bound, in traits, they are dynamically bound. If you write "super.toString" in a class, you know exactly which method implementation will be invoked. When you write the same thing in a trait, however, the method implementation to invoke for the super call is undefined when you define the trait. Rather, the implementation to invoke will be determined anew each time the trait is mixed into a concrete class. This curious behavior of super is key to allowing traits to work as stackable modifications.
Traits are a way to inherit from multiple class-like constructs, but they differ in important ways from the multiple inheritance present in many languages. One difference is especially important: the interpretation of super. With multiple inheritance, the method called by a super call can be determined right where the call appears. With traits, the method called is determined by a linearization of the classes and traits that are mixed into a class. This is the difference that enables the stacking of modifications described in the previous section.

When you instantiate a class with new, Scala takes the class and all of its inherited classes and traits and puts them in a single, linear order. Then, whenever you call super inside one of those classes, the invoked method is the next one up the chain. If all of the methods but the last call super, the net result is stackable behavior.

The precise order of the linearization is described in the language specification. It is a little bit complicated, but the main thing you need to know is that, in any linearization, a class is always linearized before all of its superclasses and mixed in traits. Thus, when you write a method that calls super, that method is definitely modifying the behavior of the superclasses and mixed in traits, not the other way around.

Linearization Rules

Scala's linearization rules are described starting on page 49 of the Scala Language Specification (SLS), Chapter 5, "Classes and Objects".

In order to allow reuse of compiled classes and to ensure well-defined behavior, the linearization must satisfy a few rules:
  • The linearization of any class must include unmodified the linearization of any class (but not trait) it extends.
  • The linearization of any class must include all classes and mixin traits in the linearization of any trait it extends, but the mixin traits need not be in the same order as they appear in the linearization of the traits being mixed in.
  • No class or trait may appear more than once in the linearization.
The main properties of Scala's linearization are illustrated by the following example: Say you have a class Cat, which inherits from a superclass Animal and two traits Furry and FourLegged. FourLegged extends in turn another trait HasLegs:
  class Animal 
  trait Furry extends Animal
  trait HasLegs extends Animal
  trait FourLegged extends HasLegs
  class Cat extends Animal with Furry with FourLegged
Class Cat's inheritance hierarchy and linearization are shown in Figure 12.1. Inheritance is indicated using traditional UML notation:[3] arrows with white, triangular arrowheads indicate inheritance, with the arrowhead pointing to the supertype. The arrows with darkened, non-triangular arrowheads depict linearization. The darkened arrowheads point in the direction in which super calls will be resolved.
image images/linearization.jpg
Figure 12.1 - Inheritance hierarchy and linearization of class Cat.
The linearization of Cat is computed from back to front as follows. The last part of the linearization of Cat is the linearization of its superclass, Animal. This linearization is copied over without any changes. (The linearization of each of these types is shown in Table 12.1 here.) Because Animal doesn't explicitly extend a superclass or mix in any supertraits, it by default extends AnyRef, which extends Any. Animal's linearization, therefore, looks like:
image images/AnimalLine.jpg
The second to last part is the linearization of the first mixin, trait Furry, but all classes that are already in the linearization of Animal are left out now, so that each class appears only once in Cat's linearization. The result is:
image images/FurryLine.jpg
This is preceded by the linearization of FourLegged, where again any classes that have already been copied in the linearizations of the superclass or the first mixin are left out:
image images/FourLeggedLine.jpg
Finally, the first class in the linearization of Cat is Cat itself:
image images/CatLine.jpg
When any of these classes and traits invokes a method via super, the implementation invoked will be the first implementation to its right in the linearization.

12.7 To trait, or not to trait? [link]

Whenever you implement a reusable collection of behavior, you will have to decide whether you want to use a trait or an abstract class. There is no firm rule, but this section contains a few guidelines to consider.

  1. If the behavior will not be reused, then make it a concrete class. It is not reusable behavior after all.
  2. If it might be reused in multiple, unrelated classes, make it a trait. Only traits can be mixed into different parts of the class hierarchy.
  3. If you want to inherit from it in Java code, use an abstract class. Since traits with code do not have a close Java analog, it tends to be awkward to inherit from a trait in a Java class. Inheriting from a Scala class, meanwhile, is exactly like inheriting from a Java class. As one exception, a Scala trait with only abstract members translates directly to a Java interface, so you should feel free to define such traits even if you expect Java code to inherit from it. See Chapter 29 for more information on working with Java and Scala together.
  4. If you plan to distribute it in compiled form, and you expect outside groups to write classes inheriting from it, you might lean towards using an abstract class. The issue is that when a trait gains or loses a member, any classes that inherit from it must be recompiled, even if they have not changed. If outside clients will only call into the behavior, instead of inheriting from it, then using a trait is fine.
  5. If efficiency is very important, lean towards using a class. Most Java runtimes make a virtual method invocation of a class member a faster operation than an interface method invocation. Traits get compiled to interfaces and therefore may pay a slight performance overhead. However, you should make this choice only if you know that the trait in question constitutes a performance bottleneck and have evidence that using a class instead actually solves the problem.
  6. If you still do not know, after considering the above, then start by making it as a trait. You can always change it later, and in general using a trait keeps more options open.

More here: http://www.artima.com/pins1ed/traits.html

Scala Case Class

Case classes can be seen as plain and immutable data-holding objects that should exclusively depend on their constructor arguments.
This functional concept allows us to
  • use a compact initialisation syntax (Node(1, Leaf(2), None)))
  • decompose them using pattern matching
  • have equality comparisons implicitly defined
In combination with inheritance, case classes are used to mimic algebraic datatypes.
If an object performs stateful computations on the inside or exhibits other kinds of complex behaviour, it should be an ordinary class.
Technically, there is no difference between a class and a case class -- even if the compiler does optimize some stuff when using case classes. However, a case class is used to do away with boiler plate for a specific pattern, which is implementing algebraic data types.
A very simple example of such types are trees. A binary tree, for instance, can be implemented like this:
sealed abstract class Tree
case class Node(left: Tree, right: Tree) extends Tree
case class Leaf[A](value: A) extends Tree
case object EmptyLeaf extends Tree
That enable us to do the following:
// DSL-like assignment:
val treeA = Node(EmptyLeaf, Leaf(5))
val treeB = Node(Node(Leaf(2), Leaf(3)), Leaf(5))

// On Scala 2.8, modification through cloning:
val treeC = treeA.copy(left = treeB.left)

// Pretty printing:
println("Tree A: "+treeA)
println("Tree B: "+treeB)
println("Tree C: "+treeC)

// Comparison:
println("Tree A == Tree B: %s" format (treeA == treeB).toString)
println("Tree B == Tree C: %s" format (treeB == treeC).toString)

// Pattern matching:
treeA match {
  case Node(EmptyLeaf, right) => println("Can be reduced to "+right)
  case Node(left, EmptyLeaf) => println("Can be reduced to "+left)
  case _ => println(treeA+" cannot be reduced")

// Pattern matches can be safely done, because the compiler warns about
// non-exaustive matches:
def checkTree(t: Tree) = t match {
  case Node(EmptyLeaf, Node(left, right)) =>
  // case Node(EmptyLeaf, Leaf(el)) =>
  case Node(Node(left, right), EmptyLeaf) =>
  case Node(Leaf(el), EmptyLeaf) =>
  case Node(Node(l1, r1), Node(l2, r2)) =>
  case Node(Leaf(e1), Leaf(e2)) =>
  case Node(Node(left, right), Leaf(el)) =>
  case Node(Leaf(el), Node(left, right)) =>
  // case Node(EmptyLeaf, EmptyLeaf) =>
  case Leaf(el) =>
  case EmptyLeaf =>
Note that trees construct and deconstruct (through pattern match) with the same syntax, which is also exactly how they are printed (minus spaces).
And they can also be used with hash maps or sets, since they have a valid, stable hashCode.
  • Case classes can be pattern matched
  • Case classes automatically define hashcode and equals
  • Case classes automatically define getter methods for the constructor arguments.

Those are the only differences to regular classes.

Regexp Lookahead & Lookbehind

Source: http://denis-zhdanov.blogspot.com/2009/10/regexp-lookahead-and-lookbehind.html
Source: http://www.regular-expressions.info/lookaround.html


Generally speaking 'lookahead' allows to check if subsequent input characters match particular regexp. For example, let's consider word 'murmur' as our input and capitalize all 'r' symbols that are not followed by 'm' symbol (i.e. we expect to get 'murmuR' as the output). Here is a naive approach:

    System.out.println("murmur".replaceAll("r[^m]", "R"));

However, it doesn't perform any change, i.e. 'murmur' is printed. The reason is that 'r[^m]' means that the pattern is 'r' symbol followed by the symbol over than 'm'. There is no symbol after the last 'r' symbol, so, the pattern is not matched.

Here lookahead comes to the rescue - it allows to check if subsequent symbol(s) match to the provided regexp without making matched subsequent character(s) part of the match. It may be 'positive' or 'negative', i.e. allows to define if we're interested in match or unmatch. Here is the example:

public static void main(String[] args) throws Exception {
    // 'Negative' lookahead, 'r' is not followed by 'm'.    System.out.println("murmur".replaceAll("r(?!m)", "R")); // prints 'murmuR'
    // 'Positive' lookahead, 'r' is followed by 'm'.    System.out.println("murmur".replaceAll("r(?=m)", "R")); // prints 'muRmur'
    // It's possible to use regexp as 'lookahead' pattern    System.out.println("murmur".replaceAll("m(?!u[^u]+u)", "M")); // prints 'murMur'}


'Lookbehind' behaves very similar to 'lookahead' but works backwards. Another difference is that it's possible to define only finite repetition regexp as 'lookbehind' pattern:

public static void main(String[] args) throws Exception {
    // 'Negative' lookbehind, 'c' is not preceded by 'b'.    System.out.println("abcadcaeec".replaceAll("(?<!b)c", "C")); // prints 'abcadCaeeC'
    // 'Positive' lookbehind, 'c' is preceded by 'b'.    System.out.println("abcadcaeec".replaceAll("(?<=b)c", "C")); // prints 'abCadcaeec'
    // It's possible to use finite repetition regexp as 'lookbehind' pattern    System.out.println("abcadcaec".replaceAll("(?<=e{2})c", "C")); // doesn't match because there is no 'c' preceded by two 'e'
    // It's not possible to use regexp that doesn't imply obvious max length as 'lookbehind' pattern    System.out.println("abcadcaec".replaceAll("(?<=a[^a]+ae?)c", "C")); // PatternSyntaxException is thrown here}


Is Java compiled or interpreted? - Both

Depending on the execution environment, bytecode can be:
  • compiled (to the native instructions understood by the CPU) ahead of time and executed as native code (similar to C++)
  • compiled just-in-time and executed
  • interpreted
  • directly executed by a supported processor (bytecode is the native instruction set of some CPUs)
Some HotSpot JVMs start out by interpreting bytecodes, and only compiles them to native code after they have figured out what is worth compiling, and gathered some stats on how the code is being run.

Git Theme

  ui = auto
[color "branch"]
  current = yellow reverse
  local = yellow
  remote = green
[color "diff"]
  meta = yellow bold
  frag = magenta bold
  old = red bold
  new = green bold
[color "status"]
  added = yellow
  changed = green
  untracked = cyan


Transaction Isolation Levels

Transaction Isolation Levels

The ANSI/ISO SQL standard defines four levels of transaction isolation, with different possible outcomes for the same transaction scenario. That is, the same work performed in the same fashion with the same inputs may result in different answers, depending on your isolation level. These levels are defined in terms of three phenomena that are either permitted or not at a given isolation level:
  • Dirty read: A transaction reads data that has been written by another transaction that has not been committed yet.
    Data integrity is compromised, foreign keys are violated, and unique constraints are ignored.
  • Nonrepeatable (fuzzy) reads: A transaction re-reads data it has previously read and finds that another committed transaction has modified or deleted the data.
  • Phantom read: A transaction re-runs a query returning a set of rows that satisfies a search condition and finds that another committed transaction has inserted additional rows that satisfy the condition.

The SQL isolation levels are defined based on whether they allow each of the preceding phenomena. It's interesting to note that the SQL standard doesn't impose a specific locking scheme or mandate particular behaviors, but rather describes these isolation levels in terms of these phenomena—allowing for many different locking/concurrency mechanisms to exist (see Table 1).

Isolation Level Dirty Read Nonrepeatable Read Phantom Read
READ UNCOMMITTED Permitted Permitted Permitted
READ COMMITTED -- Permitted Permitted
REPEATABLE READ -- -- Permitted

 Table 1: ANSI isolation levels

READ UNCOMMITTED - no lock on table
READ COMMITTED - lock on committed data
REPEATABLE READ - lock on block of sql(which is selected by using select query)
SERIALIZABLE - lock on full table(on which Select query is fired)

Oracle Database offers the read committed and serializable isolation levels, as well as a read-only mode that is not part of SQL92. Read committed is the default.

Overview of Locking Mechanisms

In general, multiuser databases use some form of data locking to solve the problems associated with data concurrency, consistency, and integrity. Locks are mechanisms that prevent destructive interaction between transactions accessing the same resource.

Resources include two general types of objects:
  • User objects, such as tables and rows (structures and data)
  • System objects not visible to users, such as shared data structures in the memory and data dictionary rows

Locking Policy

A locking policy is an important component of any multiuser application. When users share objects in an application, a locking policy ensures that two or more users do not attempt to modify the same object or its underlying data simultaneously.

Types of locking policy:
  • Optimistic (Write) Lock: All users have read access to the object. When a user attempts to write a change, the application checks to ensure that the object has not changed since the last read (checking by version number, date, timestamps, or checksums/hashes).
  • Optimistic Read Lock: As with optimistic lock, the optimistic read lock ensures that the object has not changed before writing a change. However, the optimistic read lock also forces a read of any related tables that contribute information to the object.
  • Pessimistic Locking: When a user accesses an object to update it, the database locks the object until the update is completed. No other user can read or update the object until the first user releases the lock. The database offers this locking type.
  • No Locking: The application does not verify that data is current.
Optimistic assumes that nothing's going to change while you're reading it.
Pessimistic assumes that something will and so locks it.

Optimistic Locking details: When you write the record back you filter the update on the version to make sure it's atomic. (i.e. hasn't been updated between when you check the version and write the record to the disk) and update the version in one hit.
If the record is dirty (i.e. different version to yours) you abort the transaction and the user can re-start it.

This strategy is most applicable to high-volume systems and three-tier architectures where you do not necessarily maintain a connection to the database for your session. In this situation the client cannot actually maintain database locks as the connections are taken from a pool and you may not be using the same connection from one access to the next.

Advantages of optimistic locking:
  • It prevents users and applications from editing stale data.
  • It notifies users of any locking violation immediately, when updating the object.
  • It does not require you to lock up the database resource.
  • It prevents database deadlocks.
Disadvantages of optimistic locking:
  • Optimistic locking cannot prevent applications from selecting and attempting to modify the same data. When two different processes modify data, the first one to commit the changes succeeds; the other process fails and receives an OptimisticLockException. 
Pessimistic Locking details is when you lock the record for your exclusive use until you have finished with it. It has much better integrity than optimistic locking but requires you to be careful with your application design to avoid Deadlocks. To use pessimistic locking you need either a direct connection to the database (as would typically be the case in a two tier client server application) or an externally available transaction ID that can be used independently of the connection.

Because pessimistic locks exist for the duration of the current transaction, the associated database transaction remains open from the point of the first lock request until the transaction commits. When the transaction commits or rolls back, the database releases the locks.

Advantages of Pessimistic Locking
  • It prevents users and applications from editing data that is being or has been changed.
  • Processes know immediately when a locking violation occurs, rather than after the transaction is complete.
Disadvantages of Pessimistic Locking
  • It is not fully supported by all databases.
  • It consumes extra database resources.
  • It requires to maintain an open transaction and database lock for the duration of the transaction, which can lead to database deadlocks.
  • It decreases the concurrency of connection pooling when using the server session, which affects the overall scalability of your application.
In the latter case you open the transaction with the TxID and then reconnect using that ID. The DBMS maintains the locks and allows you to pick the session back up through the TxID. This is how distributed transaction using two-phase commit protocols (such as XA or COM+ Transactions) work.

Resolving conflicts between transactions

How can you serialize a set of changes from overlapping business transactions that affect the same object when your primary concern is for users to see changes immediately as they occur?
  • First Commit Wins - forcing the user to start a new business transaction, starting from the updated object that results from the first committed transaction.
  • Last Commit Wins - initiating a new server transaction automatically and replaying the user's changes, which may overwrite some of the changes from the first committed transaction.
  • Merge Conflicting Updates - allowing the replay of changes in the last transaction to selectively update the conflicting object. 

First Commit Wins - A transactional application server allows the first server transaction to "win" in the sense that successive transactions having a write-write conflict will fail. Whenever a client application experiences a commit failure, abort the transaction, and refresh the user's view from the current state of the server. This will cause the user to lose all of their changes, forcing them to start a new business transaction. They will need to reevaluate the new data manually in order to decide whether to redo their changes or not.  

This is a very easy pattern to implement. While it sounds terrible to cause the end user to redo their work, this may be acceptable when the probability of a commit failure is so low that it can be ignored.

Last Commit Wins - A transactional application server allows the first server transaction to "win" in the sense that successive transactions having a write-write conflict will fail. However, it is possible for the second client to abort the failed transaction, begin a new transaction, replay the changes, and commit the new transaction.

This is an easy pattern to implement, as long as there is a way to define transaction specifications in the client. At first glance, this solution appears to violate the objectives of a transactional system, since the second transaction may overwrite the effects of the first transaction. However, as long as appropriate business rules validate changes as they are played back, there is no risk of changing the repository into an inconsistent state.

One of the disadvantages of this solution is that the software automatically resolves the conflict without notifying the end user that a conflict occurred. In situations where such notification is important or manual merging of changes is more appropriate, then the First Commit Wins or the Merge Conflicting Updates patterns are more appropriate.

The disadvantages of this solution may be acceptable when the probability of a commit failure is so low that it can be ignored or where commit failures typically do not have business consequences. For example, suppose a customer service organization changes addresses and other customer contact information, while a billing organization updates a customer's purchasing information. It is possible that the application server may report a conflict between a customer service transaction and a billing transaction, while from a business perspective the two transactions are updating different aspects of the customer object.

Merge Conflicting Updates - A transactional application server allows the first server transaction to "win" in the sense that successive transactions having a write-write conflict will fail. However, it is possible for the second client to abort the failed transaction, begin a new transaction, replay the changes, and commit the new transaction. While replaying the changes, you must check each individual change to determine whether there is a conflict with the first committed transaction. If so, prompt the user to determine whether or not the new change should be applied.

This solution identifies the server transaction conflict to the user, enabling the user to make the necessary adjustments to merge the two sets of changes appropriately. It minimizes the amount of rework that the end user must do, in order to complete their business transaction.

What is Transaction Context Propagation

Transaction context propagation is necessary if multiple instances are to participate in a single distributed transaction. A transaction context defines the scope of a transaction and associates all participating resources and transactional operations. All participating threads share the same transaction context.

Multiple app instances need to participate in the same transaction if an application instance makes a remote call into another application instance in the scope of an existing transaction, assuming the EJB semantics for the method support scoping work in a client’s transaction. An example of this is a servlet in application instance 1 obtaining a reference to an EJB residing in application instance 2, starting a transaction and making a method call on the remote EJB in the scope of the transaction. When multiple application instances participate in a single transaction, all work done by the participating application instances as part of the global transaction is guaranteed to be atomic. 


  1. http://www.oracle.com/technetwork/issue-archive/2005/05-nov/o65asktom-082389.html,
  2. http://docs.oracle.com/cd/B28359_01/server.111/b28318/consist.htm#CNCPT1314,
  3. http://docs.oracle.com/cd/B14099_19/web.1012/b15901/dataaccs008.htm,
  4. http://docs.spring.io/spring/docs/current/spring-framework-reference/html/transaction.html,
  5. http://docs.spring.io/spring/docs/current/spring-framework-reference/html/transaction.html#tx-propagation 


Plain English explanation of Big O

What is a plain English explanation of Big O? With as little formal definition as possible and simple mathematics

The simplest definition I can give for Big-O notation is this:
Big-O notation is a relative representation of the complexity of an algorithm.
There are some important and deliberately chosen words in that sentence:
  • relative: you can only compare apples to apples. You can't compare an algorithm to do arithmetic multiplication to an algorithm that sorts a list of integers. But a comparison of two algorithms to do arithmetic operations (one multiplication, one addition) will tell you something meaningful;
  • representation: Big-O (in its simplest form) reduces the comparison between algorithms to a single variable. That variable is chosen based on observations or assumptions. For example, sorting algorithms are typically compared based on comparison operations (comparing two nodes to determine their relative ordering). This assumes that comparison is expensive. But what if comparison is cheap but swapping is expensive? It changes the comparison; and
  • complexity: if it takes me one second to sort 10,000 elements how long will it take me to sort one million? Complexity in this instance is a relative measure to something else.
Come back and reread the above when you've read the rest.
The best example of Big-O I can think of is doing arithmetic. Take two numbers (123456 and 789012). The basic arithmetic operations we learnt in school were:
  • addition;
  • subtraction;
  • multiplication; and
  • division.
Each of these is an operation or a problem. A method of solving these is called an algorithm.
Addition is the simplest. You line the numbers up (to the right) and add the digits in a column writing the last number of that addition in the result. The 'tens' part of that number is carried over to the next column.
Let's assume that the addition of these numbers is the most expensive operation in this algorithm. It stands to reason that to add these two numbers together we have to add together 6 digits (and possibly carry a 7th). If we add two 100 digit numbers together we have to do 100 additions. If we add two 10,000 digit numbers we have to do 10,000 additions.
See the pattern? The complexity (being the number of operations) is directly proportional to the number of digits n in the larger number. We call this O(n) or linear complexity.
Subtraction is similar (except you may need to borrow instead of carry).
Multiplication is different. You line the numbers up, take the first digit in the bottom number and multiply it in turn against each digit in the top number and so on through each digit. So to multiply our two 6 digit numbers we must do 36 multiplications. We may need to do as many as 10 or 11 column adds to get the end result too.
If we have two 100-digit numbers we need to do 10,000 multiplications and 200 adds. For two one million digit numbers we need to do one trillion (1012) multiplications and two million adds.
As the algorithm scales with n-squared, this is O(n2) or quadratic complexity. This is a good time to introduce another important concept:
We only care about the most significant portion of complexity.
The astute may have realized that we could express the number of operations as: n2 + 2n. But as you saw from our example with two numbers of a million digits apiece, the second term (2n) becomes insignificant (accounting for 0.0002% of the total operations by that stage).
One can notice that we've assumed the worst case scenario here. While multiplying 6 digit numbers if one of them is 4 digit and the other one is 6 digit, then we only have 24 multiplications. Still we calculate the worst case scenario for that 'n', i.e when both are 6 digit numbers. Hence Big-O notation is about the Worst-case scenario of an algorithm

The Telephone Book

The next best example I can think of is the telephone book, normally called the White Pages or similar but it'll vary from country to country. But I'm talking about the one that lists people by surname and then initials or first name, possibly address and then telephone numbers.
Now if you were instructing a computer to look up the phone number for "John Smith" in a telephone book that contains 1,000,000 names, what would you do? Ignoring the fact that you could guess how far in the S's started (let's assume you can't), what would you do?
A typical implementation might be to open up to the middle, take the 500,000th and compare it to "Smith". If it happens to be "Smith, John", we just got real lucky. Far more likely is that "John Smith" will be before or after that name. If it's after we then divide the last half of the phone book in half and repeat. If it's before then we divide the first half of the phone book in half and repeat. And so on.
This is called a binary search and is used every day in programming whether you realize it or not.
So if you want to find a name in a phone book of a million names you can actually find any name by doing this at most 20 times. In comparing search algorithms we decide that this comparison is our 'n'.
  • For a phone book of 3 names it takes 2 comparisons (at most).
  • For 7 it takes at most 3.
  • For 15 it takes 4.
  • For 1,000,000 it takes 20.
That is staggeringly good isn't it?
In Big-O terms this is O(log n) or logarithmic complexity. Now the logarithm in question could be ln (base e), log10, log2 or some other base. It doesn't matter it's still O(log n) just like O(2n2) and O(100n2) are still both O(n2).
It's worthwhile at this point to explain that Big O can be used to determine three cases with an algorithm:
  • Best Case: In the telephone book search, the best case is that we find the name in one comparison. This is O(1) or constant complexity;
  • Expected Case: As discussed above this is O(log n); and
  • Worst Case: This is also O(log n).
Normally we don't care about the best case. We're interested in the expected and worst case. Sometimes one or the other of these will be more important.
Back to the telephone book.
What if you have a phone number and want to find a name? The police have a reverse phone book but such look-ups are denied to the general public. Or are they? Technically you can reverse look-up a number in an ordinary phone book. How?
You start at the first name and compare the number. If it's a match, great, if not, you move on to the next. You have to do it this way because the phone book is unordered (by phone number anyway).
So to find a name:
  • Best Case: O(1);
  • Expected Case: O(n) (for 500,000); and
  • Worst Case: O(n) (for 1,000,000).

The Travelling Salesman

This is quite a famous problem in computer science and deserves a mention. In this problem you have N towns. Each of those towns is linked to 1 or more other towns by a road of a certain distance. The Travelling Salesman problem is to find the shortest tour that visits every town.
Sounds simple? Think again.
If you have 3 towns A, B and C with roads between all pairs then you could go:
  • A → B → C
  • A → C → B
  • B → C → A
  • B → A → C
  • C → A → B
  • C → B → A
Well actually there's less than that because some of these are equivalent (A → B → C and C → B → A are equivalent, for example, because they use the same roads, just in reverse).
In actuality there are 3 possibilities.
  • Take this to 4 towns and you have (iirc) 12 possibilities.
  • With 5 it's 60.
  • 6 becomes 360.
This is a function of a mathematical operation called a factorial. Basically:
  • 5! = 5 × 4 × 3 × 2 × 1 = 120
  • 6! = 6 × 5 × 4 × 3 × 2 × 1 = 720
  • 7! = 7 × 6 × 5 × 4 × 3 × 2 × 1 = 5040
  • 25! = 25 × 24 × … × 2 × 1 = 15,511,210,043,330,985,984,000,000
  • 50! = 50 × 49 × … × 2 × 1 = 3.04140932 × 1064
So the Big-O of the Travelling Salesman problem is O(n!) or factorial or combinatorial complexity.
By the time you get to 200 towns there isn't enough time left in the universe to solve the problem with traditional computers.
Something to think about.

Polynomial Time

Another point I wanted to make quick mention of is that any algorithm that has a complexity of O(na) is said to have polynomial complexity or is solvable in polynomial time.
Traditional computers can't solve polynomial-time problems. Certain things are used in the world because of this. Public Key Cryptography is a prime example. It is computationally hard to find two prime factors of a very large number. If it wasn't, we couldn't use the public key systems we use.
Anyway, that's it for my (hopefully plain English) explanation of Big O (revised).

It shows how an algorithm scales.
  • 1 item: 1 second
  • 10 items: 100 seconds
  • 100 items: 10000 seconds
  • 1 item: 1 second
  • 10 items: 10 seconds
  • 100 items: 100 seconds
  • 1 item: 1 second
  • 10 items: 1 second
  • 100 items: 1 second
Notice that the number of items increases by a factor of 10, but the time increases by a factor of 102. Basically, n=10 and so O(n2) gives us the scaling factor n2 which is 102. This time the number of items increases by a factor of 10, and so does the time. n=10 and so O(n)'s scaling factor is 10. The number of items is still increasing by a factor of 10, but the scaling factor of O(1) is always 1.
That's the gist of it. They reduce the maths down so it might not be exactly n2 or whatever they say it is, but that'll be the dominating factor in the scaling.

See also:


Maven sources

# mvn dependency:sources
# mvn dependency:resolve -Dclassifier=javadoc


Maven Tips and Tricks: Advanced Reactor Options

Source: http://blog.sonatype.com/2009/10/maven-tips-and-tricks-advanced-reactor-options
Docs: http://maven.apache.org/plugins/maven-reactor-plugin/examples.html

Starting with the Maven 2.1 release, there are new Maven command line options which allow you to manipulate the way that Maven will build multimodule projects. These new options are:
-rf, –resume-from
    Resume reactor from specified project
-pl, –projects
    Build specified reactor projects instead of all projects
-am, –also-make
    If project list is specified, also build projects required by the list
-amd, –also-make-dependents
    If project list is specified, also build projects that depend on projects on the list