At the risk of turning this into a SQL Server blog I have to bring up one more issue that I came across. I was looking at the Microsoft SQL Server functions in PHP trying to figure out how how to get the last inserted id from an identity field. With MySQL you can use mysql_insert_id() in PHP.
A little more time on Google turned up the SELECT @@IDENTITY
method. Then I came across this “Alternatives to @@IDENTITY in SQL Server 2000” page. This revealed that there are three different ways to get the value of an identity field back, each one provides a potentially different answer depending on the conditions:
SELECT @@IDENTITY
SELECT IDENT_CURRENT('tablename')
SELECT SCOPE_IDENTITY()
If you want the details on how each one works go back and read the article. The one that I’m interested in is SELECT SCOPE_IDENTITY()
, it will return the identity value scoped by the connection and statement. This avoids the possible scope changes (think triggers and multiple tables).
So far MySQL hasn’t had to deal with this level of complexity for auto_increment allowing mysql_insert_id() to ignore the scope issues that SQL Server 2000 has. This is already changing though with the introduction of triggers in MySQL 5.0. It isn’t clear any more what mysql_insert_id() is going to do under these conditions. Perhaps they’ll add another function or an option to mysql_insert_id() to make it explicit.
All of this research into identity and auto_increment fields got me to wondering why they don’t just implement sequences like PostgreSQL and Oracle have. Instead of getting the id value after inserting data, you get it before hand by requesting the next value from the sequence. In Oracle you get a new value via something like this:
SELECT sequence_name.nextval FROM dual
PostgreSQL provides a similar method:
SELECT nextval('sequence.name')
To go the extra mile you can use these methods as the default value for your id field which makes it work similar to auto_increment/identity. In PostgreSQL there is a short cut for doing this via the serial data type.
Just in case this wasn’t enough to sell you on the idea of sequences there is more. You can the start value, minimum value, maximum value and increment values. In PostgreSQL you can also indicate some rules on what to do about cycling values. In Oracle the default maximum value for a sequence is 999999999999999999999999999, or if you prefer
nine hundred ninety-nine septillion.
nine hundred ninety-nine sextillion.
nine hundred ninety-nine quintillion.
nine hundred ninety-nine quadrillion.
nine hundred ninety-nine trillion.
nine hundred ninety-nine billion.
nine hundred ninety-nine million.
nine hundred ninety-nine thousand.
nine hundred ninety-nine.
The default maximum value for a sequence is slightly smaller at 9223372036854775807, or
nine quintillion.
two hundred twenty-three quadrillion.
three hundred seventy-two trillion.
thirty-six billion.
eight hundred fifty-four million.
seven hundred seventy-five thousand.
eight hundred seven.
Not bad. Both of those are very large numbers and if you are incrementing by one (which is the default) then you could go for quite awhile before you hit the wall.
So the title of this entry is sequence vs. auto_increment/identity and hopefully you’ll agree that while auto_increment and identities are simpler, their lack of features and problems with scope make them inferior to sequences. With sequences you get more features and less confusion about what they are doing (remember triggers?).
What I’d like to see is MySQL and SQL Server convert over to sequences, but continue to expose the post insert values using the same methods and functions that they do now for auto_increment/identity. They could even use a similar shortcut to PostgreSQL’s serial data type to allow for backwards compatibility when creating tables. This approach would give them the greater functionality of sequences, make them more compatible with PostgreSQL and Oracle (which would making porting apps easier) and retain backwards compatibility with their previous versions.
Any bets on how long it will take for this to happen? Is there a proposed feature list for MySQL 6.0 and SQL Server 2010 🙂
6 replies on “Sequence vs Auto_Increment/Identity”
A nice feature about AUTO_INCREMENT is that you are guaranteed that it will never fail:
Assume that table “foo” has an auto-incremented column “foo_auto”. Even though the AUTO_INCREMENT’s counter may have reached a value of 238 for the next row’s “foo_auto” column, if you insert a row where the value of foo_auto has explicitly been set to 238, AUTO_INCREMENT will “see” this, and insert 239 next time you insert a row where you don’t explicitly state the primary key value.
In other auto-generating schemes, it’s not easy to allow for both auto-generated values and explicitly stated values.
I’ve written about the subject at http://troels.arvin.dk/db/rdbms/#mix-identity
If you want to ensure that you have unique values in a column then you should be placing some sort of unique constraint on it. As you note in the link you provided fields that make use of sequences or identities are usually primary keys, which would include a unique constraint. Having an identity field enforce uniqueness is one more thing to remember. What if you don’t want your identity field to enforce uniqueness? I have no idea why you’d want that, but it is one more thing to remember. Either way I’d contend that the correct way to enforce this is with a unique constraint.
In the link you provided you mention that the standard specifies the identity method and not the sequence method. I’m assuming that you are talking about SQL-2003 since you mention at the top of that page. There is a PDF covering new features in SQL-2003 at http://www.wiscorp.com/SQLStandards.html see the link SQL 2003 Paper (PDF) under Key Readings. According to this PDF both identity (pages 21-22) and sequences (pages 15-20) are included in the standard. I don’t know if Oracle and PostgreSQL sequences meet the spec exactly, but to indicate them as non-standard because the don’t support identity is clearly wrong. This is also supported by the SQL:2003 Has Been Published PDF from ACM which mentions both sequences and identities as part of the SQL 2003 standard.
Overall I like your comparison. The PostgreSQL version you mention is over a year old so there may be some changes that could go into your list.
Yes, PostgreSQL and Oracle have sequences, and I think that they are fairly compliant compared to the SQL standard’s SEQUENCE specs, although I haven’t examined it closely yet.
IDENTITY and SEQUENCEs are not the same thing, although clearly related. The link I referred you to solely examines whether the DBMSes implement the standard’s IDENTITY feature; in cases where they don’t, I mention some alternative ways to obtain somewhat comparable effects.
About “to indicate them as non-standard because the don’t support identity is clearly wrong”, I don’t know what you mean. I don’t write that PostgreSQL and Oracle are non-standard because they don’t support the IDENTITY column attribute. As I write, the IDENTITY feature is a non-core (i.e. optional) SQL:2003 feature.
At this point, my page doesn’t have a section dedicated to SEQUENCES.
What I wanted to point out is that MySQL’s AUTO_INCREMENT feature actually has a nice property: The fact that AUTO_INCREMENT makes it easy to handle situations where the application sometimes inserts automatic, sometimes explicit (non-automatic) values into the AUTO_INCREMENT-column.
You are correct, I see your point. PostgreSQL does not implement the identity feature from the standard. Instead I should have suggested that you include a section on sequences since they are also part of the standard.
[…] sábado, marzo 25 2006 @ 08:40 EST At the risk of turning this into a SQL Server blog I have to bring up one more issue that I came across. I was looking at the Microsoft SQL Server functions in PHP trying to figure out how how to get the last inserted id from an identity field. With MySQL you can use mysql_insert_id() in PHP. Complete story: http://joseph.randomnetworks.com/archives/2006/03/23/sequence-vs-auto_incrementidentity/:: :: […]
It appears that sequences will be a feature of SQL Server 2011…
http://www.sql-server-performance.com/articles/dev/sequence_sql_server_2011_p1.aspx