Create a unique integer id column for an existing string column

by nali   Last Updated July 12, 2019 08:06 AM

I have a PostgreSQL server with an existing table that has a fixed-width, non-unique string column such as this:

| ID_STRING |
| 'ABCDEFG' |
| 'HIJKLMN' |

Now I want to compute an integer id for each distinct string value and store it in an additional column. The result should look like this:

| ID_STRING | ID_INT |
| 'ABCDEFG' |   1    |
| 'HIJKLMN' |   2    |
| 'ABCDEFG' |   1    |
| 'HIJKLMN' |   2    |

Is there an easy way to achieve this?



Answers 2


To add the new column use:

ALTER TABLE the_table ADD COLUMN id_int integer;

To populate the new column you need an UPDATE statement.

I am assuming you have a primary key column named pk_column in your table. Obviously you need to replace that with the actual primary key column in your table.

-- dense_rank() gives every distinct id_string the same consecutive number
update the_table
   set id_int = t.rn
from (
  select pk_column,
         dense_rank() over (order by id_string) as rn
  from the_table
) t
where the_table.pk_column = t.pk_column;

If you really have a table without a primary key (why?), you can use the built-in system column ctid (the physical location of the row version) instead:

update the_table
   set id_int = t.rn
from (
  select ctid as id_, 
         dense_rank() over (order by id_string) as rn
  from the_table
) t
where the_table.ctid = t.id_;
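
After running either UPDATE, it may be worth checking that the mapping came out as intended. The following is only a sketch of such a check, assuming the table and column names used above (the_table, id_string, id_int); both queries should return no rows if every distinct string received exactly one id and no id is shared between different strings.

```sql
-- Strings that map to zero or several ids, or that still have rows without an id.
select id_string,
       count(distinct id_int)                  as distinct_ids,
       count(*) filter (where id_int is null)  as missing_ids
from the_table
group by id_string
having count(distinct id_int) <> 1
    or count(*) filter (where id_int is null) > 0;

-- Ids that are shared by more than one distinct string.
select id_int,
       count(distinct id_string) as distinct_strings
from the_table
group by id_int
having count(distinct id_string) > 1;
```

Note that the ids are computed once from the rows present at the time of the UPDATE; rows inserted later are not numbered automatically.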
a_horse_with_no_name
July 12, 2019 07:31 AM

Your requirement is a little difficult to understand. It seems you want a unique ID value per unique string value, but not unique across the entire data set, i.e. if you have ABCDEF multiple times in the data set, the integer value will be the same across them.

If so, you can use the DENSE_RANK() function to assign consecutive integers, one per distinct string value, so that equal strings always get the same number. Example below:

CREATE TABLE DataTable (NonUniqueString VARCHAR(25));

INSERT INTO DataTable
VALUES ('ABCDEF'), ('GHIJKL'), ('ABCDEF'), ('GHIJKL'), ('ABCDEF');

SELECT NonUniqueString,
    DENSE_RANK() OVER (ORDER BY NonUniqueString) AS "Group"
FROM DataTable
ORDER BY NonUniqueString;

Results:

NonUniqueString     Group
-------------------------
ABCDEF              1
ABCDEF              1
ABCDEF              1
GHIJKL              2
GHIJKL              2

NOTE: The example was from MS SQL Server but the DENSE_RANK() function should behave the same in PostgreSQL and uses the same syntax.
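
To illustrate why DENSE_RANK() fits here rather than the other ranking functions, the following sketch runs all three side by side on the same sample values. It uses an inline VALUES list instead of a real table, and the aliases v, s, rn, rnk and dense_rnk are made up for this illustration.

```sql
-- Compare the three ranking functions on duplicate strings.
select s,
       row_number() over (order by s) as rn,        -- unique per row: 1..5
       rank()       over (order by s) as rnk,       -- ties share a value, gaps follow: 1 and 4
       dense_rank() over (order by s) as dense_rnk  -- ties share a value, no gaps: 1 and 2
from (values ('ABCDEF'), ('GHIJKL'), ('ABCDEF'), ('GHIJKL'), ('ABCDEF')) as v(s)
order by s;
```

ROW_NUMBER() would give each row its own number, and RANK() would number the second string 4 instead of 2, so only DENSE_RANK() yields one gap-free id per distinct string.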

HandyD
July 12, 2019 07:33 AM
