I have a PostgreSQL server with an existing table that has a fixed-width, non-unique string column, such as this:
| ID_STRING |
| 'ABCDEFG' |
| 'HIJKLMN' |
Now I want to compute integer ids for each element and store them into an additional column. The result should look like this:
| ID_STRING | ID_INT |
| 'ABCDEFG' | 1 |
| 'HIJKLMN' | 2 |
| 'ABCDEFG' | 1 |
| 'HIJKLMN' | 2 |
Is there an easy way to achieve this?
To add the new column use:
ALTER TABLE the_table ADD COLUMN id_int integer;
To populate the new column you need an UPDATE statement.
I am assuming you have a primary key column named pk_column in your table. Obviously you need to replace that with the actual primary key column in your table.
    update the_table
      set id_int = t.rn
    from (
      select pk_column,
             dense_rank() over (order by id_string) as rn
      from the_table
    ) t
    where the_table.pk_column = t.pk_column;
If you really have a table without a primary key (why?), you can use the built-in ctid system column instead:
    update the_table
      set id_int = t.rn
    from (
      select ctid as id_,
             dense_rank() over (order by id_string) as rn
      from the_table
    ) t
    where the_table.ctid = t.id_;
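After running either UPDATE, it may be worth checking that the mapping came out as intended. The following is only a sketch, assuming the table and column names used above (the_table, id_string, id_int); both queries should return no rows if every string maps to exactly one integer and vice versa.

```sql
-- Strings that map to zero or more than one integer id.
-- A group where id_int is still NULL shows up here, because
-- count(DISTINCT ...) ignores NULLs and yields 0.
SELECT id_string, count(DISTINCT id_int) AS distinct_ids
FROM the_table
GROUP BY id_string
HAVING count(DISTINCT id_int) <> 1;

-- Integer ids that are shared by more than one string.
SELECT id_int, count(DISTINCT id_string) AS distinct_strings
FROM the_table
GROUP BY id_int
HAVING count(DISTINCT id_string) <> 1;
```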
Your requirement is a little difficult to understand. It seems you want a unique ID value per unique string value, but not unique across the entire data set, i.e. if you have ABCDEF multiple times in the data set, the integer value will be the same across them.
If so, you can use the DENSE_RANK() function to produce an incrementing integer id grouped based on the non-unique strings. Example below:
    CREATE TABLE DataTable (NonUniqueString VARCHAR(25));

    INSERT INTO DataTable
    VALUES ('ABCDEF'), ('GHIJKL'), ('ABCDEF'), ('GHIJKL'), ('ABCDEF');

    SELECT NonUniqueString,
           DENSE_RANK() OVER (ORDER BY NonUniqueString) AS "Group"
    FROM DataTable;

    NonUniqueString  Group
    ---------------  -----
    ABCDEF           1
    ABCDEF           1
    ABCDEF           1
    GHIJKL           2
    GHIJKL           2
NOTE: The example was from MS SQL Server but the DENSE_RANK() function should behave the same in PostgreSQL and uses the same syntax.
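To illustrate why DENSE_RANK() is the window function that fits here, rather than RANK() or ROW_NUMBER(), here is a small comparison in PostgreSQL syntax. The table demo_strings and its column s are hypothetical names used only for this sketch.

```sql
-- Hypothetical scratch table, used only for this illustration.
CREATE TEMPORARY TABLE demo_strings (s varchar(25));

INSERT INTO demo_strings
VALUES ('ABCDEF'), ('GHIJKL'), ('ABCDEF'), ('GHIJKL'), ('ABCDEF');

SELECT s,
       row_number() OVER (ORDER BY s) AS row_num,   -- unique per row
       rank()       OVER (ORDER BY s) AS rnk,       -- same per string, but leaves gaps
       dense_rank() OVER (ORDER BY s) AS dense_rnk  -- same per string, no gaps
FROM demo_strings
ORDER BY s, row_num;
```

With this data, row_num runs from 1 to 5, rnk is 1 for every 'ABCDEF' row and 4 for every 'GHIJKL' row, and dense_rnk is 1 and 2 respectively, which matches the numbering asked for in the question.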